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GENERAL  INTRODUCTION 


The  final  report  chronicles  various  aspects  of  the  research 
conducted  by  the  principal  investigator  on  the  wavefront  decon¬ 
volution  problem.  Only  work  carried  out  since  the  interim  report 
RADC-TR-80-154 ,  May  1980  was  released,  is  discussed  because  the 
material  in  the  interim  report  is  complete  in  itself.  The  work 
discussed  here  naturally  forms  four  major  interrelated,  but 
separate  research  items,  and  each  topic  is  the  subject  of  a 
separate  section  of  the  final  report.  In  addition,  there  is  an 
appendix  devoted  to  a  minor  item  which  evolved  during  the  course 
of  the  investigation  but  is  not  directly  related  to  the  major 
items. 

The  four  major  sections  and  appendix  are  entitled: 

Section  1:  Imperfectly  known  Fourier  transform  pairs 
with  application  to  deconvolution. 

Section  2:  Interf erogram  reduction  for  aberrated 
annular  apertures  via  Zernike-Barakat  functions  using 
L2  and  norm  algorithms. 

Section  3:  Upper  and  lower  bounds  on  radially  symmetric 
optical  transfer  functions. 

Section  4:  Moment  estimator  approach  to  the  retrieval 
problem  in  coherence  theory. 
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Appendix :  The  Shannon  number  of  an  incoherent 
diffraction  image. 

Finally,  there  is  a  section  which  summarizes  and  integrates  the 
technical  work  performed  during  the  contract  since  its  inception. 


ii 


SECTION  1 


IMPERFECTLY  KNOWN  FOURIER  TRANSFORM  PAIRS 
WITH  APPLICATIONS  TO  DECONVOLUTION 


ABSTRACT 


An  algorithm  for  extrapolation  of  partially  known,  band- 
limited  functions  based  upon  singular  value  decomposition  of 
finite  Fourier  transforms  is  outlined.  The  decomposition  is  used 
to  make  rational  choices  for  a  discretization  of  the  problem  and 
to  make  prior  estimates  of  errors.  The  algorithm  provides  a 
systematic  approach  to  the  ill  posed  problem  of  extrapolation  of 
noisy  data.  In  this,  it  is  an  improvement  over  previously  pro¬ 
posed  iterative  algorithms  such  as  the  G-S  algorithm.  Some 
numerical  results  are  presented  for  two  situations  (with  obvious 
optical  diffraction  interpretations)  to  illustrate  the  main 
features  of  the  algorithm. 
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1. 


INTRODUCTION 


The  retrieval  of  wavefront  aberrations  from  the  measured 
point  spread  function  of  an  optical  system  has  been  the  sub¬ 
ject  of  several  recent  papers  since  the  important  work  of 
Gonsalves  [1].  Gonsalves  employed  two  different  retrieval 
methods:  parameter  search  technique  and  the  Gerschberg-Saxon 

(GS)  iterative  technique.  Both  methods  are  quite  sensitive 
to  noise  in  the  measurements.  An  alternative  method  was 
developed  by  Barakat  and  Newsam  [2]  in  research  sponsored 
by  RADC,  which  attacked  the  problem  as  a  nonlinear  least 
squares  estimation  problem  using  a  filtered  version  of 
singular  value  decomposition.  The  BN  approach  rests  upon  two 
provisos:  (1)  the  aberrated  wavefront  itself  is  the  primary 

objective  of  the  inversion  and  not  an  assumed  functional 
form  of  it  as  Gonsalves  [l],  and  Southwell  [ 3 ] »  (2)  the  nonlinear 
inversion  method  is  tailored  to  be  robust  with  respect  to  noise 
in  the  measured  point  spread  function.  The  representative 
calculations  discussed  and  displayed  in  [2]  show  ample  testi¬ 
mony  to  the  usefulness  of  this  mathematically  elaborate  approach. 
The  extension  of  the  BN  approach  to  time  adaptive  imaging  is 
presently  under  investigation. 


i 
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Given  these  preliminary  remarks,  we  are  convinced  that 


the  GS  algorithm  or  some  alternative  has  potential  usefulness 
in  the  active  optics  deconvolution  scheme.  Private  discussions 
with  several  investigators  who  have  employed  the  GS  algorithm 
(some  in  the  optics  area  and  others  in  the  electron  micro¬ 
scopy  area)  have  indicated  a  sort  of  hit-or-miss  attitude  with 
respect  to  the  behavior  of  the  algorithm  in  various  situations. 
Sometimes  the  algorithm  works  and  sometimes  it  fails  when  the 
data  are  noisy.  With  possible  exception  of  Youla's  recent 
study  [^],  there  are  really  no  serious  attempts  to  understand 
the  stability  of  the  GS  algorithm  with  respect  to  noise  in  the 
measurements.  The  work  of  Papoulis  [5],  although  extremely 
Interesting  simply  does  not  address  these  types  of  Questions 
which  are  so  vital  if  GS  type  algorithms  are  to  be  used  in 
deconvolution  schemes  of  adaptive  optics. 

We  feel  that  an  alternate  algorithm  can  be  developed  which, 
as  in  the  BN  algorithm,  can  be  made  robust  with  respect  to 
noisy  data  by  starting  ah  initio.  The  purpose  of  this  chapter 
is  to  outline  some  aspects  of  the  new  algorithm  and  to  present 
numerical  data  for  one  typical  situation.  Because  the  current 
contract  has  been  terminated  extensions  of  the  analysis  to  more 
complicated  situations  is  held  in  abeyance. 
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2. 


MODEL  PROBLEM  AND  ASSOCIATED  THEORY 


The  model  problem  for  extension  of  finite  Fourier  transforms 
is : 

A 

Given  a  function  g(v)  measured  on  veAi[a1,a2]  that  is  known 
to  be  the  Fourier  transform  of  a  function  G(u>)  which  has  support 

A 

contained  in  B  =  [b  ,b  ],  extend  g(v)  to  a  function  g(v) 

2  2 

defined  on  the  entire  real  axis. 

/v 

Because  B  is  a  bounded  set,  g(v)  is  analytic  by  the  Paley- 
Wiener  theorem  [6]j  consequently  a  unique  extension  exists. 
However,  the  problem  is  therefore  one  of  analytic  continuation 
which  is  known  to  be  highly  sensitive  to  measurement  error  in 

A 

g(v),  see  [7,8].  This  is  a  strong  argument  for  an  algorithm 
with  a  clear  direct  treatment  of  noise  included  as  part  of 
the  formulation. 


The  following  notation  is  introduced. 

Definition  1.  Let  G(co)eL2,  the  Fourier  transform  g(v)  of  G(w) 
is  defined  as 


g(v) 


,co 


J 

—  CO 


e27rivw 


G(w)dw 


(2.1) 
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£La, 


and  denoted  by 


g  =  FG 


(2.2) 


The  inverse  transform  is  denoted  by 
G  =  F~  1  g 


(2.3) 


Definition  2.  Let  S  be  a  subset  of  the  real  line.  The  pro¬ 
jection  operator  on  L2  associated  with  S  is  denoted  by  Pg  and 
defined  by 


(P  f)(v)  =  f(v)  veS 

5 

=  0  v^S 


(2.4) 


In  this  notation,  the  model  problem  is:  Given  a  measure- 
ment  g  find  g  and  G  such  that 


g 


VPAG 


g  =  FPaG 


(2.5) 


The  problem  can  be  symmetrized  using  simple  translations  and 
scaling  to  read:  Given  a  measurement  g  find  g,  G  such  that 

fi  =  PC^PCG  ,  g  =  PPCG  (2.6) 


where 


C 


1 

[-C,c],  c  = J  [(a2-aj ) (b2-bj ) ]2 


(2.7) 
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In  order  to  make  quantitative  judgments  on  noise  and  ill 
conditioning  in  Eq.  2.6,  we  shall  use  the  concepts  of  singular 
value  decomposition.  We  assume  the  reader  is  familiar  with 
this  method  of  analysis  applied  to  either  infinite  dimensional 
compact  linear  operators  or  to  finite  matrices;  a  discussion 
of  the  first  case  can  be  found  in  Baker  [9],  of  the  second  in 
Stewart  [10]. 

In  a  series  of  papers,  Landau  et  at.  [11,12],  the  struc¬ 
ture  of  PC^PC  was  fully  exposed.  The  key  results  are  repeated 
here  for  convenience: 

1)  PG^PC  is  a  normal  compact  operator,  and 

| |PcEPcI|2  1  1  (2.8) 

2)  The  eigenfunctions  of  pcppc  can  be  taken  as  real. 

They  are  then  the  prolate  spheriodal  wavefunctions  here  denoted 
by  <J>  (v,c),  making  explicit  the  dependence  on  c. 

3)  The  eigenvalues  An(c)  satisfy 

A  (c)  -  e1,n/' | »  |  <2'9) 

n  1  n' 
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4)  The  singular  values  are  denoted  by  a  (c)  with 


°n(c)  H 


X  (c) 
n 


(2.10) 


They  behave  as  follows 


a.  For  a  fixed  c  and  increasing  n 


on(c)  ~  (c2/n) 


(2.11) 


b.  For  a  fixed  n,  there  exist  constants  an ,  8n 
such  that 


-8  c2 

an(c)  -  1-ane  »  Vc 


(2.12) 


Since  P  FP  is  compact,  it  has  a  singular  value  decomposi- 
c  c 

tion  U,  E,  V  where  U  and  V  are  complete  sets  of  orthonormal 
functions  and  E  is  the  decreasing  sequence  of  nonnegative 
numbers  defined  in  Eq.  2.10.  Since  PcFPc  is  normal. 


U 


{ "A1'-' 


V 


(2.13) 


where  un  and  vfi  are  constants  such  that  lunl=lvn!=l*  With  this 
machinery,  the  solution  to  the  problem,  see  Eq.  2.6,  can  theo- 

/v 

retically  be  determined  by  expanding  g  and  G  as 
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g  =  E  a  v  d>  ( v ,  c  ) 

ZD  n  rt  t  y\  '  >  ' 


n=0 


n  nTn 


G  =  E  b  u  (J)  (v,c ) 
n=0  n  n  n 


(2.14) 


(2.15) 


determining  the  coefficients  directly  from  the  known  g  and  the 


b  from  the  relations 
n 


g  =  P  FP  G  <= 
°  c  c 


=>  P  FP  (b  u  o  ) 
c  c  n  n  n 


(2.16) 


which  in  turn  imply 
a  v 


b  n  t  n 

n  b  u 
n  n 


(2.17) 


In  this  representation  of  the  solution,  the  degree  of  ill 

conditioning  and  the  effect  of  noise  in  Eq.  2.16  are  deduced 

from  the  behavior  of  a  .  Since  P  FP  is  compact  a  0; 

n  c  c  n 

therefore  a  small  error  in  the  coefficient  an  of  a  high  fre¬ 
quency  (large  n)  component  anvn'J)n(v  >c )  of  g(v)  will  induce  a 
large  error  in  the  coefficient  b  ,  as  given  by  Eq .  2.17,  of 
the  corresponding  component  of  G.  If  the  random  error  in 

A 

measurement  is  assumed  to  appear  over  all  components  of  g, 
then  the  calculated  high  frequency  components  of  G  are 
not  to  be  trusted.  But  as  the  decomposition  allows  a 
direct  observation  of  the  effects  of  noise,  so  also  does  it 
allow  direct  action  to  reduce  these  effects;  a  filter  f(o,e) 
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dependent  on  the  noise  level  e  in  g  is  introduced  so  that  G  is 
estimated  by  G' 

oo  /  v  \ 

G  '  (to)  =  I  a  MM  f(o  )A  (w,c)  (2.18) 

n=0  \  nf  n 

where 


f(a  )  -*■  (a  )-1  as  a  -*■  1 
n  n  n 

->•0  as  a  +  0 

n 


(2.19) 


Choice  of  filters  is  an  art,  so  we  shall  only  note  that  if 
f(a  ,e)  =  0  for  a  <  a„T  then  the  system,  Eq .  2.18,  is  said  to 
have  N-l  essential  degrees  of  freedom  [21],  these  can  be  used 
as  an  estimate  of  ill  conditioning,.  A  more  general  estimate 
is  the  rate  of  decay  of  o^,  the  larger  the  rate  of  decay  the 
more  ill  conditioned  the  system. 

In  view  of  this  analysis,  the  major  conclusions  to  be 
drawn  are : 


1.  PCFPC  is  badly  ill  conditioned  since  ®n(c)  exhibit 
exponential  decay. 

2.  For  a  given  c  and  noise  level  e,  the  number  n(e,c)  of 
singular  values  significantly  greater  than  e  can  be  bounded 
above  by  Eq.  2.12. 
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3.  For  a  given  c,e  the  singular  values  are  approximately 
distributed  as  a  step  function,  i.e., 

n  <  n(e ,c  )  =>  a  ( c)  ~  1 

n  (2.20) 

n>n(e,c)  =>  an(c)  ~  0 


The  algorithm  proposed  is  based  upon  the  following  procedure, 

For  a  given  c  and  e,  n(e,c)  is  calculated.  If  it  is  suffi¬ 
ciently  small,  then  accurate  numerical  approximations  to  on,$n 
( v , c )  are  calculated  for  n  <  n(e,c)  and  G  is  estimated  by  Eq.  2.18 
using  a  filter  of  the  form 


f  (a,e) 


1 

a 

0 


a  >  e 
a  <  e 


(2.21) 


The  algorithm  is  optimal  in  that  it  focuses  on  accurately  cal- 
culating  components  of  g  that  are  significant  in  estimating  G 
and  ignoring  components  that  are  useless  for  estimation. 

We  conclude  this  section  with  a  discussion  of  the  theory  of 
the  Gerschberg-Saxton  algorithm  from  our  viewpoint  and  notation  [13]. 
The  algorithm  estimates  G  ^  gn+1  from  Gn,  gn  by  the  following 
update 

g  ..  =  P-FP  G  +  g,  g  =  0  (2.22) 

&n+l  c  c  n  65  &0 
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Gn+1  '  V~‘®n+1  •  Q.  *  0  (2.23) 

where  C  is  the  complement  of  C. 

Concatenating  these  two  equations  yields 

G  =  P  F~ 1 ( P-FP  G  +  g),  G  =  0  (2.24) 

n+1  c  c  c  n  &  *  0 


which  in  turn  implies 

G  ,  =  (I-P  F~ 1 P  FP  ) G  +P  F~ 1 P  g,  (2.25) 

n+1  c  c  c  n  c  c&s 

noting  that 

PG  =  G  ,Pg=g  (2.26) 

c  n  n  ’  c&  B 

Pc  +  P—  =  I  (identity  operator)  (2.27) 

Equation  2.25  has  a  computational  advantage  over  the  previous 
equations  in  that  it  only  requires  evaluation  of  functions 
on  the  bounded  set  C  instead  of  the  infinite  domain  C. 

Equation  2.25  is  recognizable  as  the  simplest  form  of 
iterative  solution  of  the  normal  equations  associated  with 
Eq.  2.6,  that  is 
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(2.28) 


P  F~ 1 P  g  =  P  F~ 1 P  F P  G 
c  c  c  c  c 

The  normal  equations  are  usually  not  formed  for  ill  conditioned 

systems  since  they  have  even  poorer  conditioning  because  if 

P^FP^  has  singular  value  cn,  then  the  singular  value  associated 

with  P  F~ 1 P  FP  is  a2.  However  since  the  a  display  the  step- 
cccn  n 

like  behavior  noted  above,  for  almost  all  significantly  non¬ 
zero  singular  values  an  ~  an  ~  1>  so  little  is  lost  in  con¬ 
sidering  Eq.  2.6. 


In  the  presence  of  noise,  Eq .  2.25  is  usually  modified  so 
that  it  either  stops  when 


I G  -  G  ...  I  <  6 
1  n  n+1 1 1  — 


(2.29) 


or,  a  X  >  0  chosen  and  the  revised  scheme 


G  ,  =  (l-X)G  -  P  F~ 1 P  FP  G  +  P  F~ 1 P  g 

n+1  v  '  n  c  c  c  n  c  c& 


(2.30) 


used,  where  6  and  X  depend  upon  the  noise  level  e.  The  second 
algorithm,  if  iterated  to  completion,  corresponds  to  a  choice 
of 


f(o) 


1 

o+X 


(2.31) 


in  Eq .  2.21.  However,  this  is  impossible  in  practice,  and  since 
there  Is  no  "good"  theory  linking  termination  after  N  steps  with 
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noise  levels  e,  the  clear  insights  used  to  modify  singular 
value  decomposition  solutions  in  the  presence  of  noise  cannot 
be  applied  to  these  alternative  schemes.  One  important  con¬ 
sequence  of  this  lack  of  theory  is  that  it  is  not  possible  to 
tell  a  priori  how  much  computation  is  needed  to  reach  a  desired 
solution,  since  neither  the  number  of  iterations  or  the  accuracy 
at  each  such  iteration  can  be  determined  as  a  function  of  known 
noise  levels. 


There  is  a  large  literature  on  alternative  techniques  for 
Eqs.  2.6  and  2.28  which  is  summarized  in  Patterson  [14],  The 
most  attractive  is  the  conjugate  gradient  method 

rn  *  V'1  PcS  -  V‘  V  °n  •  (2-32) 

$ 

where 


^n+1  ^n  +  anpn 
pn+l  =  rn+l  +  Bnpn  > 


and 


a 


n 


I'Wnll2 


M-nll2 


The  initial  conditions  are 


r .  =  P  1  P„g  ,  G  =  0,  p  =  r 
0  c  c"  *  0  ^0  0 


(2.33) 


(2.34) 


(2.35) 


We  mention  it  here,  as  it  is  both  an  iterative  algorithm  and  a 
Galerkin  approximation  and  so  can  be  examined  using  the  analysis 
of  the  next  section. 


3. 


DISCRETE  PROBLEM  AND  ASSOCIATED  THEORY 


This  section  is  concerned  with  the  formation  of  finite 
dimensional  approximations  to  Eq.  2.6,  their  solution,  and  the 
relative  merits  of  different  approximations  to  Eq.  2.6.  Since 
the  natural  error  metric  for  Eq.  2.6  is  ||*||2,  all  discretiza¬ 
tions  considered  are  derived  using  Galerkin's  method.  The 
procedure  is  to  choose  two  sets  of  linearly  independent  functions 

<VK  .  IVL  ;  I  l*J  I  -  lie, ||  =  i  (3.D 

k=l  1=1  k  1 

and  define  the  following  quantities: 

Definition  3.  Let  S„  denote  the  subspace  spanned  by  {iL  }  , 

^  K  k=l 

SL  the  subspace  spanned  by  {e^}1"  >  let  denote  the  projection 

1 

operator  from  Lz  onto  SK,  QL  the  projection  operator  from  L2 
on  S^. 

Approximat ions 

A  A 

SK  eSK  g,GL  eSL  G  (3.2) 

are  now  generated  by  requiring  that 

SK  =  PKg  5  PKg  =  PKPc  FPcGL  (3.3) 

These  equations  can  be  cast  into  matrix  form 
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Bb  =  c  ,  c  =  Da 
by  defining 


(3.4) 


*  K  L 

SK  =  Vk  5  °L  =  £^1  aiQH 

/v 

—  ('I^jS)  >  ^ij  ~ 


( 9i  ’  9j  )  5  Dk2.  ~  ^k»PcFPc0P  ’ 


where  (*,*)  denotes  the  inner  product  on  L2. 


(3.5) 


Each  such  Galerkin  approximation  implicitly  defines  an 
associated  finite  rank  operator  RRL  such  that 


R 


KL 


"  Vc^L 


(3.6) 


PKg  RKLGL  '  (3.7) 

RKL  ^ePen<*s  only  upon  the  subspaces  S^,  and  not  on  the 
particular  choice  of  bases  {4^},  C © ^ 3-  as  do  the  matrices  A,B,D. 

In  the  formulation,  there  is  considerable  freedom  of  choice  for 
these  subspaces  and  bases;  it  is  therefore  appropriate  to  adopt 
criteria  by  which  a  particular  choice  can  be  judged.  Five  such 
criteria  will  be  used,  but  before  judgment  is  made  it  is 
necessary  to  distinguish  between  the  presence  and  absence  of  noise. 
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A  choice  is  first  made  in  the  absence  of  noise,  then  in 
light  of  a  known  noise  level.  The  criteria  are: 

1.  Aoouraoy  of  approximation:  It  seems  reasonable  to 
require  that  a  particular  approximation  be  part  of  a  sequence  of 
approximations  that  satisfy 


lim 

K,L-*» 


-Vp0H  *  0 


(3.8) 


If  Eq .  2.6  possesses  a  solution  G,  then 


lim 

K,L-k» 


0 


(3.9) 


These  conditions  may  be  difficult,  if  not  impossible,  to  prove  for 
the  general  choice;  however,  the  following  proposition  (stated 
without  proof)  shows  at  least  one  such  basis  exists: 


Proposition  1:  If  K  =  L  =  N  and  ^k,  9^  are  chosen  so  that 
4*k(v)  =  9k(v)  =  <j>k(v)  then  the  Galerkin  approximation  generated 
by  Eq.  3.3  satisfies  Eqs.  3.8  and  3.9.  Furthermore, 


RNN_PcFPc 


R 


K'L 


f-P„FP„ 
’  c  c 


for  any  other  choice  of  Sj.,,  S^,  with 

dim(SRI)  =  K'  <  N  ;  dim(SL,)  =  L?  <  N 


(3.10) 


(3.11) 
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2.  Conditioning  of  A,B,D:  The  previous  criterion  indicate 
whether  or  not  S^,  are  good  choices  as  approximating  subspaces, 
but  says  nothing  about  a  particular  choice  of  bases  within  those 
subspaces.  It  is  possible  to  choose  0^  so  that  D  is  well 

conditioned;  but  this  is  an  illusory  gain  since  the  matrices  B 
and/or  A  will  then  be  ill-conditioned.  Therefore,  calculations 
of  g^(v),  G^(v)  ma-de  using  the  representations  given  in  Eq.  3.5 
will  be  inaccurate.  There  is  not  a  good  figure  for  comparison 
of  ill-conditioning  induced  by  different  bases,  but  some  indica¬ 
tion  of  the  dependence  follows  from  propositions  2  and  3  (stated 
without  proof): 

K  L 

Proposition  2:  Let  S„,  ST  be  fixed  subspaces;  }  ,  {0.} 

k=l  4=1 

K  T 

be  orthonormal  bases  for  S.,,  ST  .  Also  let  {ipf}  ,  {0J}  be 

*  ^  *  K  k=l  *  4=1 

be  any  other  bases  for  SK,  SL«  If  A,  AT,  B,  B',  D,  D'  are  the 
corresponding  matrices  defined  by  Eq.  3.5,  then 

det(B'D'A’)  <  det(BDA)  =  det(D)  .  (3.12) 

Proposition  3:  Let  ^  =  0^  =  K  =  L  =  N  and  D  be  the  matrix 

formed  in  Eq.  3.5.  Let  { K  {0J}L  be  any  other  choice  of 

K  k=l  *  i=l 

orthonormal  basis  such  that  K’,  L'  _>  N,  then 

det(D)  >  det(D')  .  (3. 13) 


1-17 


3. 


Desirable  special  results:  Particular  choices  of 
0 ^  give  special  results  that  are  often  desirable.  If  both  bases 
are  orthogonal,  then  it  is  easily  shown  that  the  singular  values 
of  D  are  the  same  as  those  of  which  in  turn  approximate  the 
first  few  singular  values  of  P  fP  .  If  K  =  L  and  \p.  ,  0,  are 

O  O  rV  jV 

chosen  so  that 


=  PcFPc9k 


(3.14) 


then  the  Galerkin  approximation  is  also  a  least  squares  solution, 
i  .  e .  , 

I  Is-PcFPcGlI  I  -  l|g-P</PcH||  ,  VHeSL  .  (3.15) 


Again  if  the  Galerkin  approximation  Is  generated  using  the  eigen¬ 
vectors  as  the  bases  then  both  these  results  hold. 


4.  Ease  of  computation:  In  spite  of  the  advantages  of 
using  eigenfunctions  to  form  an  approximation  as  outlined  above, 
we  do  not  consider  their  use  in  actual  computation  because 
there  is  no  known  simple  algorithm  for  their  numerical  evaluation. 
Since  numerical  evaluation  of  Fourier  transforms  and  orthonormal¬ 
ization  are  computationally  expensive,  it  is  convenient  to  choose 

{0„}P  to  be  an  orthonormal  set  of  simple  functions  whose 
*  1=1 

Fourier  transforms  can  be  evaluated  analytically .  However,  the 
finite  Fourier  transform  of  a  simple  function  is  itself  often 
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not  simple,  so  the  inner  products  » pcfPce £ )  must  ultimately 
be  done  numerically. 

5.  Ease  of  extrapolation:  Because  it  is  desired  to 

A 

extrapolate  g(v)  to  a  function  g(v)  on  the  entire  real  line,  a 
choice  of  simple  0^  will  allow  analytic  computation  of  the 
approximation 

eL  '  F?cGh  <3.16) 

to  g;  numerical  computation  could  again  be  expensive. 

For  the  noiseless  case,  the  issues  raised  by  the  above 
criteria  lead  to  the  following  sequence  of  bases  and  Galerkin 
approximations  to  the  solution  of  Eq.  2.6.  First,  choose  a 
det{p,  }°°  that  is  a  complete  orthonormal  basis  of  simple 

k=l 

functions  for  L2(c)  with  the  property  that  each  eigenfunction 

can  be  expressed  as  a  rapidly  convergent  series  in  pk>  If 

possible,  also  require  that  PcFPcpk  is  close  to  ckPk,  where  ck 

is  a  constant.  Second,  for  a  given  N  choose  a  K(N)  such  that 

the  first  N  eigenfunctions  are  well  approximated  by  {p, 

K  k=l 

Choose  t|Jk  =  0k  =  Pk  and  form  the  c  and  D  of  Eq.  3.5.  Find 
the  singular  value  decomposition  of  D  =  U£V+.  It  can  be 
shown  that  if  the  approximation  is  sufficiently  accurate,  the 
largest  N  singular  values  and  associated  singular  functions  are 
the  approximations  to  the  first  N  o  ,  <J)  .  Therefore,  the 


remaining  singular  values  of  D  are  set  equal  to  zero  and 

the  system  c  =  Da  solved  in  a  least  squares  sense. 


a  =  VfU  c  , 


(3. 17) 


where 


i® 

kk 


lf  Ekk  11  0 


=  0  ir  1. ,  »  0 

kk 


(3. 13) 


If  the  above  conditions  on  {p^.}  are  satisfied,  one  can  show  that 
the  sequence  of  approximate  solutions  G^  so  generated,  converges 
to  the  true  solution  G  if  it  exists. 

In  the  case  of  a  known  noise  level,  the  two-fold  sequence 
listed  in  the  first  part  of  the  previous  paragraph  is  combined 
with  the  approach  outlined  in  Sec.  2.  Instead  of  generating  a 
convergent  sequence  of  approximations,  a  single  approximation 
is  found  that  is  as  accurate  as  possible  in  the  presence  of  the 
given  noise  level.  The  number  N  =  n(e,c)  of  singular  values 
significantly  greater  than  e  is  calculated  and  K(N)  Is  chosen 
so  that  the  first  N  an  <J>n  are  approximated  to  within  some 
tolerance  6(e)  by  {p^}.  D  is  formed  and  decomposed  as  in  the 
noise-free  case;  however,  ^  is  redefined  as 
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& 


=  T~ 1  if  r  >  f 
kk  ^kk  1  ^kk  e 


-  0  if  Ekk  <  c 


(3. 19) 


then  the  approximation 
K(N) 

°N  *  J,  Vk 


(3.^0) 


is  formed  and  g  =  fG..  is  evaluated. 

N  N 

The  construction  of  an  algorithm  for  the  inversion  of  Eq . 

2.6  in  the  presence  of  noise  is  now  complete.  The  salient 
features  are:  use  of  the  singular  value  decomposition  of  PcPPc, 
prior  estimation  of  the  number  of  degrees  of  freedom  n(e,c) 
present  at  a  given  noise  level,  and  construction  of  a  Galerkin 
approximation  from  a  predetermined  number  of  functions  that  are 
computationally  simple  but  also  approximate  the  first  n(o,e) 
eigenfunctions . 

We  conclude  this  section  by  noting  that  the  conjugate 
gradient  method  discussed  at  the  very  end  of  Sec.  2  is  also  a 
Galerkin  approximation,  where  4^  =  0k  =  rk’  *'or  s°lutl°n  °P 

the  normal  equations,  see  Eq .  2.28.  Although  it  has 
many  of  the  features  highlighted  by  our  five  criteria  (e.g.,  it 
produces  a  sequence  of  approximations  convergent  to  the  solution, 
(if  one  exists),  it  does  not  allow  the  filtering  of  noise,  an 
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1 


estimate  such  as  n(e,c)  or  K(N)  of  the  number  of  iterations 
needed,  nor  is  it  easily  computed.  In  fact,  in  most  iterative 
schemes  a  basis  is  implicitly  used  to  represent  the  iterates; 
it  would  seem  best  then  to  directly  form  a  Galerkin  approximation 
using  this  basis. 
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4. 


ILLUSTRATIVE  NUMERICAL  CALCULATIONS 


To  illustrate  the  analysis  of  the  previous  sections,  a 
particular  basis  set  was  chosen  and  its  Galerkin  approximation 
studied  from  two  viewpoints:  (1)  analysis  of  the  matrix  D 
of  Eq.  3.5  and  (2)  solution  of  Eq.  2.6  using  Eq.  3.16  in  the 
presence  of  noise. 

The  basis  chosen  is  the  piecewise  constant  functions  over 
[-c,c].  Since  a  complex-valued  singular  value  decomposition 

A 

code  for  matrices  is  not  available,  g,  G  and  g  in  Sec.  2  are 
taken  to  be  real  even  functions.  Item  (1)  above  is  not  affected 
because  the  real  matrix  D+D  could  be  formed  and  decomposed  to 
recover  the  singular  values.  Although  the  singular  values  and 
singular  vectors  of  P  FP  split  into  even  and  odd  subsets,  the 

v  L/ 

subsets  display  the  same  distribution  so  this  restriction  does 
not  significantly  change  the  inversion  problem.  The  basic 
functions  v*  employed  are 

_  ,..y  _  /N  [2c  (k-1  )-cN  2ck-cN 

pk(a0  ~  (2 c)  ’  we[- — n - >  n — 

*  0  ,  elsewhere  .  (4.1) 

The  Galerkin  approximat ion  was  formed  as  discussed  in 

Sec.  3  by  choosing  K  =  L  =  N  and  ^  =  0^  =  p^.  The  functions 

P  FP  p  were  evaluated  analytically.  Consequently  the  vector 
c  c  n 

A 

Cj  =  (g,Pj)  was  produced,  where  7  point  Gauss  quadrature  was 


used  to  integrate  the  inner  products.  Prom  Secs.  2  and  3, 
the  quantities  of  interest  are  the  number  of  significantly 
nonzero  singular  values  of  PcFPc  for  a  fixed  c.  In  addition, 
we  wish  to  know  how  well  they  are  approximated  by  the  singular 
values  of  for  a  particular  N. 

The  singular  values  of  were  numerically  evaluated  on 
a  computer  for  various  N  and  c  values.  Table  1  is  a  summary 
of  the  results.  Recall  that  n(e,c)  indicates  the  number  of 
singular  values  of  PC-PPC  greater  than  e,  N(e,c)  indicates  the 
number  of  basis  functions  needed  to  insure  that  the  singular 
.a]ues  of  are  within  an  accuracy  e  of  those  of  PcFPc. 

Because  it  is  impractical  to  consider  every  value  of  N,  the 
table  gives  an  interval  containing  N(e,c)  rather  than  the 
exact  result.  The  table  indicates  that  the  piecewise  constant 
functions  are  by  no  means  an  optimal  basis  in  that  N(e,c)  is 
not  close  to  n(e,c).  Nevertheless,  it  is  adequate  for  large 
and  small  c.  The  piecewise  constant  functions,  however,  are 
a  very  good  choice  from  a  computational  viewpoint.  Furthermore, 
as  we  will  shortly  see,  it  appears  that  it  is  not  necessary 
for  the  basis  functions  to  be  good  approximations  to  the 
singular  values  in  order  to  perform  accurate  inversions. 

Two  functions  (with  obvious  optical  interpretations)  were 
chosen  for  numerical  experimentation.  They  are: 
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(4.2a) 


g(v)  = 


sln27Tv 

TTV 


=  2 


(sinirv  \2 

)  • 


The  corresponding  Fourier  transforms  are 


(4.3a) 


G  ( 00 )  =  1  ,  |w|  <  1 

=  0  ,  |  co  |  >  1  (4.2b) 

and 

G(to)  =  2  Cl-  |  to  |  )  ,  |  to  |  <  1 

=  0  i  co  |  >  1  (4.3b) 

In  order  to  make  the  numerical  results  specific,  we  gated  the 
g(v)  functions  to  the  finite  interval  |v|  <  1  (i.e.,  c  =  1  in 
Eq.  2.7). 

Approximations  to  the  G's  were  calculated  using  the 
Galerkin  approximation  generated  by  {p^}  and  Eq.  3.5.  The 
filter  of  Eq.  2.21  was  used  with  the  cutoff  taken  to  be  the 
greater  of  the  noise  level  5 (specified  below)  and  the  accuracy 
of  the  Galerkin  approximation  (taken  to  be  ~10-2  for  these 
problems).  Approximations  to  the  true  extrapolations  g(v) 
in  the  range  |v|  <  4,  were  found  from  the  exact  Fourier  trans¬ 
forms  of  the  calculated  G's,  i.e., 
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(4.10 


N 

%  =  FGN  =  l  anFpn  • 
n=l 

To  simulate  the  presence  of  noise  in  the  measurements  of 

✓N  A 

g(v),  whenever  a  value  of  g(v)  was  required  in  the  numerical 

integrations  needed  to  calculate  the  vector  c.  =  (g,p.),  a 

J  J 

random  perturbation  was  added.  That  is 

A  A  ITlcLX  A 

g6(v)  =  g(v)  +  <5y  vec  I g ( v ) I  (4.5) 

was  used  instead  of  g(v).  The  constant  6  denotes  the  noise 
level,  y  is  a  random  variable  uniformly  distributed  over  [-1,1]. 
The  effect  of  noise  on  the  inversions  and  extrapolations  was 
studied  by  comparing  with  the  noiseless  inversion/extrapolation 
as  the  standard.  In  all  cases  to  be  discussed,  the  noise 
level  was  3 %,  <5  =  0.03. 

Figure  1  shows  four  sample  realizations  of  G(o)),  cor- 

A 

responding  to  Eq.  4.2b,  as  calculated  from  the  noisy  g(v)  using 
a  basis  of  40  piecewise  constant  functions.  The  G(co)  calculated 
in  the  absence  of  noise  coincided  with  the  true  G(to)  to  within 
less  than  0.05$.  Four  reconstructions  of  G(w)  in  the  presence 
of  1%  noise  are  listed  in  Table  2,  and  have  about  a  5$  maximum 
deviation  from  the  true  value.  The  extrapolations  of  g(v)  for 
two  of  these  sample  realizations  in  the  presence  of  3%  noise 
are  displayed  in  Figs.  2  and  3  (see  open  circles),  with  the 
exact  g(v)  shown  as  the  solid  line. 
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Figures  4  and  5  show  the  behavior  of  G(w) ,  Eq.  4.3b, 

A 

as  calculated  from  four  sample  realizations  of  g(v);  this  time 
with  N  =  80.  As  before,  the  noiseless  calculations  coincide 
with  the  true  G(to)  to  well  within  graphical  accuracy.  However, 
the  extrapolation  is  not  as  well  behaved,  as  witness  Fig.  6. 

This  is  undoubtedly  due  to  the  fact  that  the  true  G(w)  has  a 
discontinuity  in  slope  at  cj  =  0 .  Failure  to  reproduce  the 
correct  slope  at  the  origin  propagates  an  error  throughout  the 
remainder  of  the  algorithm. 

The  extrapolations  of  the  corresponding  g(v)  for  three 
of  these  sample  realizations  (in  the  presence  of  3%  noise) 
are  show  in  Figs.  7  through  9  (see  open  circles)  with  the 
exact  g(v)  shown  as  the  solid  line.  We  feel  that  the  relatively 
poor  extrapolations  of  g(v)  in  this  case  are  due  to  most  of 
the  error  in  G(ui)  appearing  at  low  frequencies  which  are  then 
transformed  into  errors  at  the  tail  of  g(v). 

We  believe  that  a  choice  of  basic  functions  that  have  the 
same  degree  of  smoothness  as  the  unknown  G ( oj )  will  tend  to 
alleviate  this  problem.  Work  in  continuing  along  these  lines. 
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TABLE  1 


DISTRIBUTION  OF  SINGULAR  VALUES  OF  P  FP  AND  REQUIRED  BASIS  SIZE 
FOR  DESIRED  ACCURACY. 


c 

n( 10"2 ,c) 

N(10~2,c) 

n( 10‘3,c) 

N( 10" 3  ,c 

.5 

4 

<10 

5 

10-20 

1.0 

8 

30-40 

9 

60-80 

1.5 

13 

60-80 

15 

>80 

2.0 

21 

>80 

23 

>80 
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TABLE  2,  FOUR  SAMPLE  REALIZATIONS  OF  RECONSTRUCTIONS  WITH  1%  NOISE. 


01 

G(tu) 

G(«) 

G(oj) 

G(o>) 

-1 

.0000 

•  983 

.973 

1.099 

.907 

- 

.9474 

1.009 

1.010 

1.043 

1.005 

- 

.8947 

1.013 

1.021 

.995 

1.0  46 

- 

.  8421 

1.004 

1.017 

.962 

1.0  50 

- 

.7895 

.991 

1.007 

.944 

1.035 

- 

.7368 

.980 

.997 

.942 

1.013 

- 

.6842 

.974 

.990 

.952 

.993 

- 

.  6316 

.975 

CT\ 

OO 

.971 

.9  80 

- 

.5789 

.980 

.993 

.995 

.976 

- 

.5263 

.988 

1.000 

1.017 

.979 

- 

.  4737 

.999 

1.008 

1.035 

.988 

- 

.4211 

1.007 

1.015 

1.046 

.998 

- 

.3684 

1.013 

1.019 

1.048 

1.007 

- 

•  3158 

1.014 

1.019 

1.042 

1.012 

- 

.  2632 

1.011 

1.015 

1.028 

1.014 

- 

.2105 

1.005 

1.008 

1.008 

1.012 

- 

.1579 

.997 

.999 

.987 

1.007 

- 

.1053 

.988 

.990 

.968 

1.001 

- 

.0526 

.982 

.983 

.953 

.996 

0 

.978 

.979 

.945 

.993 
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Fir.  5 


Two  sample  realizations  (  — —  and  - ) 

tion  of  G ( oj  )  corresponding  to  Eq  .  H  .  3 
of  3%  noise  in  g(v). 


of  the  reconstruc 
in  the  presence 
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to  sample  realization  ( - )  in  Fig.  4. 


to  sample  realization  (  — — )  in  Fig.  5. 


SECTION  2 


INTERFEROGRAM  REDUCTION  FOR  ABERRATED  ANNULAR  APERTURES 
VIA  ZERNIKE-BARAKAT  FUNCTIONS, 

USING  L2  AND  Li  NORM  ALGORITHMS 


ABSTRACT 


Interferogram  reduction  of  aberrated  wavefront  data  for 
annular  apertures  in  a  hostile  environmental  is  discussed  using 
both  the  L2  and  L  norms. 
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1. 


INTRODUCTION 


The  plan  of  this  section  Is  two-fold.  First,  the  Zernike 
functions  appropriate  to  an  annular  aperture,  as  derived  by 
Barakat  [1],  are  briefly  summarized.  The  lower  order  functions 
are  plotted,  discussed,  and  compared  with  the  Zernike  -polynomials 
appropriate  to  an  annular  aperture.  It  is  concluded  that  the 
Zernike  functions  are  the  natural  generalization  of  the  usual 
Zernike  polynomials.  Second,  the  Zernike  functions  are  discussed 
in  the  context  of  interferogram  reduction  using  the  usual  Lz 
norm,  commonly  termed  least  squares.  However,  the  L norm  is 
not  robust  with  respect  to  "outliers"  in  the  raw  fringe  data  such 
as  occur  in  the  hostile  environments  in  which  the  reduction  must 
be  expected  to  be  performed.  The  Li  norm  is  known  to  be  robust 
with  respect  to  these  outliers,  and  the  linear  programming 
approach  to  solving  the  overdetermined  fringe  reduction  equations 
to  obtain  the  aberrated  wavefront  is  discussed. 


2. 


GENERALIZED  ZERNIKE  FUNCTIONS 


The  generalized  Zernike  functions  are  a  set  of  complete 
orthogonal  functions  defined  over  the  annular  aperture  that  were 
recently  introduced  by  the  author  [2] .  They  are  conveniently 
written  in  cylindrical  coordinates  as  products  of  angular 
functions  (i.e.,  trlgnometric  functions)  and  radial  functions. 

The  most  general  version  of  the  radial  functions  defined  over 
the  annular  aperture  also  contains  a  bell-shaped  amplitude 
taper;  however,  we  will  only  concern  ourselves  with  the  situation 
where  the  amplitude  is  constant  over  the  unobscured  portion 
of  the  annular  aperture.  The  radial  coordinate  of  the  annular 
aperture  is  denoted  by  p  with  reference  to  Eq.  2.9  of  [2]  with 
a  =  0,  the  radial  annular  functions  are 


n  1-e2 


(o,m) 

P 

(n-m)/2 


(2.1) 


where : 


e  =  obscuration  ratio  0  <  e  <  1 


N™(e)  =  normalization  factor 
(o,m) 

P  =  Jacobi  polynomial 

(n-m)/2 
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Properties  of  these  functions  have  been  obtained  by  Barakat  [2] 
and  we  refer  to  this  paper  for  details.  Perhaps  the  most 
important  property  is 


1 

Bn ( P  ’ e  ) Bn  '  ( p  * e  )  pdp  =  hn(e)(Snn’ 
e 

where 


( 1-c 2 ) 

2n  +  2 


(2.2) 


(2.3) 


independent  of  m.  The  first  few  B™  functions  are  listed  in 
Table  1. 


Plots  of  some  of  the  lower  order  Bm  functions  are  shown 

n 

in  Figs.  1  to  5  for  an  obscuration  ration  e  =  .3;  the  cor¬ 
responding  Zernike  polynomial  R™(p)  is  also  shown  for  comparison. 
In  Fig.  6  we  show  B*(p)  for  e  =  .5;  compare  with  Fig.  2.  Note 
that 


B™(p=e)  =  0  (2.4) 

just  as  the  regular  Zernike  polynomials  satisfy 

Rjj(p=  0)  =  0  (2.5) 
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In  fact,  an  examination  of  these  curves  shows  that  B‘  is  a 
scaled  version  of  R®  over  e  <_  p  £  1 .  This  property  is  not 
shared  by  the  generalized  Zernike  polynomials  introduced 
originally  by  Tatian  [ 2 ]  and  studied  more  intensively  by 
Mahajan  [3] •  It  is  not  our  intention  to  discuss  these  aspects 
of  the  problem  here,  as  we  intend  to  carry  out  a  detailed 
comparison  of  the  generalized  Zernike  functions  and  the 
generalized  Zernike  polynomials  in  the  near  future.  Never¬ 
theless  we  have  shown  in  Fig.  6  as  the  line  the  corresponding 

polynomial  as  explicitly  evaluated  by  Mahajan,  which  in  our 
notation  reads 


poly 


=  3(l+e2)p3  -  2(l+e2  +  e1*) 
(l+£2-2e4) 


(2.6) 


Note  the  discontinuous  behavior  of  the  polynomial  at  the  edge 
of  the  obscured  aperture  p  =  .5. 

The  generalized  Zernike  functions  for  the  annular  aperture 

are : 

cos  m 

5”(p,0)  =  B®(p)  (2.7) 

1  sin  m 

which  we  abbreviate  by  letting  n  denote  p,0  so  that 
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We  omit  the  e  dependence  for  typographic  convenience.  In  the 
inversion  scheme  to  be  discussed  in  the  next  section,  it  is 
convenient  to  employ  lexographic  ordering  of  the  generalized 
Zernike  functions.  They  follow  naturally  into  two  classes: 
optical  systems  possessing  rotationally  symmetric  aberrations, 
and  optical  systems  lacking  any  symmetry.  The  latter  are  the 
more  important  in  the  context  of  the  problems  that  we  encounter 
in  the  deconvolution  scheme  of  things.  Thus  we  now  write 
c, .  C n )  to  denote  c,  (p)  and  refer  to  Tables  2  and  3  for  the 
explicit  lexographic  ordering  employed  in  the  present  communica¬ 
tion.  There  is  nothing  unique  about  the  ordering  and  one  can 
permute  the  ordering  if  desired.  The  wavefront  W(rj)  for  the 
case  of  an  optical  system  lacking  any  symmetry  at  a  fixed  object 
point  is 

oo  00 

w(n )  =  V  y  [c  cos  ra8  +  s  sin  m0]  Bm(p)  (2.9) 

n=0  m=0  mn 

where  cmn  and  smn  depend  upon  the  object  point.  The  restrictions 

on  n  and  m  are  that  n  >_  m  and  that  (n-m)  is  an  even  integer. 

In  the  case  of  an  optical  system  possessing  rotational  symmetry, 

s  vanishes  identically.  In  Table  2  only  the  cos  me  terms 
mn 

can  appear.  In  Table  3  both  cos  m8  and  sin  me  can  appear  and 
the  lower  case  s  and  c  indicate  this.  For  example,  j  =  14  and 
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15  both  refer  to  m  =  n  =  4 ,  but  j  =  14  is  associated  with 
cos  40  while  j  =  15  is  associated  with  sin  40  in  accordance 
with  Eq.  2.9.  We  have  listed  22  individual  aberration  terms. 
Although  we  can  continue  to  list  higher  order  terms  such 
terms  oscillate  very  rapidly  and  are  of  limited  use  in  the 
diagnostic  aspects  we  shall  be  studying. 

When  lexographic  ordering  is  employed,  then  Eq.  2.9  can  be 
rewritten  in  the  form 

00 

w(n)  =  l  z  c  (n)  (2.10) 

j=l  J  J 

where  the  expansion  coefficients  Zj  are  equivalent  to  the  c  and 
s  coefficients  in  Eq.  2.9. 
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3. 


L,  and  L2  NORM  INVERSION  SOLUTIONS 

The  determination  of  the  aberrated  wavefront  of  an  optical 
system  is  invariably  carried  out  by  inverting  the  fringe  systems 
obtained  from  interferograms .  It  is  generally  assumed  that 
the  number  of  fringe  measurements  is  greater  than  the  number 
of  expansion  terms  in  the  hypothesized  wavefront  expansion. 

In  terms  of  our  problem,  the  discretized  version  of 
Eq.  2.10  becomes  (with  j-'-n) 

*  T 

l  2  t  (nr)  =  W(nr)  ,  m  =  1,  2,  ...,  M  (3.1) 

n=l  n 

’.■/here  W  =  W(n  )  is  the  noisy  measured  wavefront  from  the 

m  m 

interferometer  at  the  points  m  =  1,  ...,  M  with  M  N.  The 
unknown  expansion  coefficients  z are  to  be  determined. 

We  can  write  Eq.  3-1  which  is  an  overdetermined,  but 
possibly  rank  deficient,  system  of  linear  equations  in  matrix 
from 

qz  =  Q  .  (3.2) 

Here  C  is  M  *  M  (M  rows  and  N  columns),  z  the  vector  of  unknowns 
is  N  *  1,  and  W  the  vector  of  noisy  measurements  is  M  *  1. 

Our  basic  problem  is  to  "invert"  Eq .  3-2  in  order  to 
determine  z  in  terms  of  W  and  t, .  In  other  words,  find 


2-8 


I 


the  vector  of  parameters  z  each  that  Eqs.  3.1  or  3.2 
fits  the  data.  The  meaning  of  the  phrase  "best  fits 
can  be  made  more  precise  by  defining  the  residual  (= 
term 


best 

the  data" 
error ) 


A 

=  w  - 


A  A 

CZ 


(3.3) 


The  most  common  definition  for  best  fit  is  the  L2  norm  criterion, 
often  called  the  least  squares  criterion  for  which  z  is  chosen 
such  that 


£  €2  =  minimum  (3.^0 

n 

This  procedure  dates  back  to  Gauss  and  Legendre  and  has  been 
of  invaluable  use  almost  since  its  inception.  The  standard 
algorithm  amounts  to  solving  the  square  system 

A  J-  A  A  A  J>  A 

C-;z  =  c  w  (3.5) 

A  A 

where  c,  £  =  normal  equations.  Straightforward  application  of 
of  this  algorithm  often  involves  numerical  instabilities  that 
render  the  "solution"  useless  such  as  in  lens  design  where 
the  matrix  equivalent  of  £  is  nearby  singular.  Fortunately 
the  condition  number  of  (£+£)  defined  as  [4] 

r  =  largest  singular  value  /, 

smallest  singular  value 
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is  not  too  large.  If  C  is  numerically  small,  a  small  relative 
change  in  £  cannot  produce  a  very  large  relative  change  in  z 
and  thus  the  normal  equations  are  well  conditioned.  If  C  has 
a  very  small  value,  then  small  relative  changes  in  £  can  produce 
large  relative  changes  in  z  and  the  normal  equations  are  poorly 
conditioned.  When  the  normal  equations  are  poorly  conditioned  the 

A 

estimate  of  z  is  a  highly  unstable  function  of  the  observed 
vector  W.  Two  slightly  different  value  of  W  will  likely  pro¬ 
duce  vastly  different  values  of  z.  Fortunately  for  us  the 
normal  equation  is  relatively  well  conditioned  with  large 
diagonal  elements  and  very  small  off  diagonal  elements.  This 
fact  follows  almost  directly  from  the  orthogonality  condition 
Eq.  2.2. 

Numerical  calculations  in  the  L2  norm  were  carried  out  by 
using  the  method  of  singular  value  decomposition  to  evaluate 
the  psuedoinverse .  For  those  readers  who  are  not  familiar  with 
the  algorithm,  reference  is  made  to  the  standard  reference  [5] 
or  to  Sec.  3  of  the  interim  report  RADC-TR-80-151*  for 
details.  In  the  singular  value  decomposition  approach,  one 
completely  avoids  the  normal  matrix  and  deals  directly  with 
Eq .  3-2  with  generally  more  accuracy  than  that  achieved  by 
working  with  Eq .  3.5.  Some  simulated  calculations  based  upon 
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synthetic  data  have  been  run  and  the  least  square/singular  value 
decomposition  works  extremely  well,  but  this  is  not  really  a 
good  test  for  reasons  to  be  discussed. 


The  first  point  concerns  the  model  itself.  Now  the  assump¬ 
tion  that  the  actual  wavefront  can  be  fitted  by  the  expansion, 

Eq .  3-lj  is  a  basic  one.  In  actuality  the  pupil  function  of 
the  optical  system  consists  of  two  terms 


A(p,q) 


A„(p,<l>elltW(P.'>) 


(3.7) 


where 


A0(p,q)  =  amplitude  distribution  over  the  exit  pupil 

W(p,q)  =  phase  distribution  (wavefront  aberration  function 
measured  in  wavelength  units)  over  the  exit  pupil. 

The  rectangular  pupil  coordinates  p,q  are  directly  related  to 
the  cylindrical  pupil  coordinates  p,0.  If  the  amplitude  dis¬ 
tribution  over  the  exit  pupil  is  constant  than  the  optical 
system  is  termed  an  Airy  system.  This  is  the  situation  we 

assume  to  be  true  when  writing  down  Eq.  3.1.  Now  if  A0(p,q) 
varies  over  the  exit  pupil  in  any  substantial  manner,  we  have 
no  guarantee  that  the  expansion  is  a  valid  representation  of 
the  diffraction  performance  of  the  optical  system  in  question, 
even  though  the  curve  fit  of  W  is  acceptable.  The  diffraction 
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performance  of  an  optical  system  is  governed  by  both  A0  and  W; 
only  when  A0  is  a  constant  does  the  determination  of  W  alone 
become  meaningful.  Small  perturbations  of  A  (p,q)  from  a 
constant  value  hopefully  have  small  effects. 

Least  squares  and  other  inversion  techniques  have  been 
used  for  many  years  to  derive  best  fit  models  to  sets  of  optical 
interferometry  data;  see  Malacara's  recent  text  [5]  for  a 
representative  discussion  as  well  as  older  papers  by  the 
author  [7-10].  It  is  common  practice  to  test  the  inversion 
techniques  with  synthetic  data  and  to  add  random  noise  to  the 
perfect  data.  The  addition  of  random  noise  will  test  the 
performance  of  the  jnversion  method  in  the  presence  of  noise, 
but  it  is  difficult  to  determine  the  influence  of  the  noise 
on  the  inverted  model  parameters.  This  is  because  additive 
noise  is  unrealistic.  Although  it  is  true  that  many  forms 
of  noise  encountered  in  optical  Interferometry  are  additive 
in  some  sense  much  of  this  can  be  removed  by  careful  instru¬ 
mentation  (and  by  averaging  of  repeated  measurements  in  some 
cases).  Noise  in  the  measured  wavefront  data  consists  of  two 
types,  regardless  of  how  the  noise  was  generated.  The  first 
type  is  that  which  cannot  be  fitted  to  any  scalar  diffraction 
model  of  the  wavefront  —  amplitude  over  the  exit  pupil.  For 
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example,  if  polarization-inducing  effects  were  present  they 
could  not  be  accounted  for  by  scalar  calculations  in  any 
simple  way,  although  there  are  ways  of  handling  some  aspects  by 
a  psuedo-scalar  approach  [11,12].  The  second  type  of  noise  is 
that  which,  although  it  may  be  unrelated  to  the  actual  physics, 
can  be  fitted  to  a  scalar  diffraction  model.  The  extent  to 
which  the  expansion  parameters  in  Eq.  3>1  are  modulated  by  this 
noise  is  the  determining  factor  as  to  whether  least  square 
formulation  is  appropriate.  This  brings  us  to  another  question. 

This  question  concerns  the  quality  of  the  measured  noisy 
data;  the  presence  of  residuals  (or  outliers)  of  quite  unequal 
sizes  calls  into  question  the  justification  for  choosing  the 
fitting  to  be  done  by  the  least  s'quares  criterion.  In  fact,  it 
is  well  known  among  statisticians  (and  virtually  unknown  to 
everyone  else!)  that  least  square  criteria  are  not  very 
"robust"  with  respect  to  outliers  in  the  wavefront  fringe  data. 
By  "robust"  one  means  that  a  technique  has  nearly  optimal 
properties  under  the  standard  assumptions  and  yet  remains  a 
good  method  when  various  failures  of  the  assumptions  occur. 
Unfortunatley  wavefront  interferometry  is  going  to  be  carried 
out  in  a  hostile  environment  as  regards  the  active  optics  that 
RADC  programs  contemplate  and  consequently  there  will  occur 
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large  outliers  in  the  wavefront  fringe  data  over  which  we  will 
have  virtually  no  control!  So  that  even  if  we  can  control  the 
occurrence  of  large  outliers  in  a  labroatory  environment  and 
thus  guarantee  the  usefulness  of  the  L2  norm  criterion;  this 
happy  state  of  affairs  is  not  the  one  we  have  to  deal  with  in 
using  wavefront  interferometry  for  deconvolution  because  the 
L2  norm  criterion  is  too  far  from  the  desired  robust  estimate. 

Another  possibility  is  the  Lj  criterion  (also  termed  the 
least  absolute  deviation)  to  minimize 


This  expression  is  called  the  L  norm  of  c  and  the  use  of  this 
norm  is  relatively  recent  although  theoretical  aspects  have 
been  known  for  a  long  time  [23].  A  valuable  property  of  the 
Lj  norm  is  that  it  is  robust 

A  practical  method  of  solving  this  type  of  problem  is  via 
the  techniques  of  linear  programming.  Unfortunately  there  are 
no  explicit  solutions  as  in  the  L2  norm.  The  linear  programming 
formulation  of  the  Lt  norm  problem  is: 

/N  A 

a.  given  C  and  w 

/V  A 

b.  find:  z  and  the  M  x  1  vector  e 
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which  minimize 


si<g 

subject  to 


(3.9) 


;z  +  e  =  W  (3.10) 

The  problem  can  be  solved  in  the  following  fashion.  First 

A  "f  A  — 

define  the  two  vectors  e  and  e:  (both  of  which  are  M  x  1) 
and  the  two  vectors  z  and  z  as  the  positive  and  negative 

A  A 

parts  of  e  and  z  respectively.  Note  that  in  this  subsection 
the  plus  sign  does  not  imply  that  the  transpose  be  taken. 

Given  these  remarks  we  reformulate  the  above  problem  as: 
minimize 


I  e+  +  l 


subject  to 


*  f  *+  /.  —  •.  ~  ^  —  r. 

C(z  -z  )  +  e  -  e  =  W 


(3.H) 


(3.12) 


where 

e  +  ,  z~  >_,  0,  z+,  z~  >_,  0  (3-13) 

An  efficient  numerical  program  [14]  is  available  for  these  cal¬ 
culations.  We  hope  to  run  some  realistic  simulations  in  the 
future . 
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TABLE  1 


LOWER  ORDER  B®(p,e) 

B°  =  1 
0 

b;  «  (l-e2r^(p2-e2)l/2 

B°  =  ( 1-  E  2  ) ~ 1  1 2p  2— e 2 — 1  ] 

2 

B2  =  (l-e2)-1  (p2-e2 ) 

Bj  =  (l-e2)"3/2  (p2^2)1''2  [3P2-g2-2] 

B3  =  (l-e2)-3  2  {p2-z2)^2 

B®  =  (l-e2)-2  [3p'*-6(l+e2 )p2+  e4  +  Le2  +  l] 
B2  =  ( 1- e 2 ) ~ 2  (P2  ~,2)  [Lp2-E2-3] 

B4  =  (l-e2)-2  (p2-E2)2 


2-18 


•i 

i 

I 

TABLE  3.  LEXOGRAPHIC  ORDERING  OF  THE  GENERALIZED  ZERNIKE  FUNCTIONS  FOR  ] 

ANNULAR  APERTURES  FOR  OPTICAL  SYSTEMS  LACKING  ANY  SYMMETRY  IN 

THE  WAVEFRONT.  I 


j 

(m,n) 

J 

(m,n) 

1 

(0,0) 

12 

(2, Me 

2 

(l,l)c 

13 

(2, Ms 

3 

(l,l)s 

ll 

(L,L)c 

L 

(0,2) 

15 

(MMs 

5 

(2,2)c 

16 

(l,s)c 

6 

(2,2)s 

17 

(l,s)c 

7 

(l,3)s 

18 

(3,5)c 

8 

(l,3)s 

19 

(3,5)a 

9 

(3,3)c 

20 

(s,s)c 

10 

(3,3)s 

21 

(s,s)s 

11 

(0,U) 

22 

(0,6) 

SECTION  3 


UPPER  AND  LOWER  BOUNDS  ON  RADIALLY  SYMMETRIC 
OPTICAL  TRANSFER  FUNCTIONS 

ABSTRACT 

For  an  optical  system  under  incoherent  imaging  situations 
the  conditions  of  finite  aperture  and  nonnegative  point  spread 
function  place  restrictive  bounds  on  the  values  of  the  correspon- 
ing  optical  transfer  function.  Under  the  extra  condition  that 
the  transfer  function  be  radially  symmetric,  these  bounds  can  be 
characterized  as  the  eigenvalues  of  a  real  symmetric  integral 
operator.  In  this  paper  the  bounds  are  calculated  first  by 
analytic  inequalities  that  bound  the  operator  norm,  then  by 
finding  the  eigenvalues  of  a  Galerk,in  approximation  to  the 
operator.  In  contrast  to  the  slit  aperture  case,  the  least  upper 
bound  and  greatest  lower  bound  are  found  to  differ.  The  pupil 
functions  which  achieve  these  bounds  at  a  given  spatial  frequency 
are  determined  along  with  the  associated  transfer  functions. 


1.  INTRODUCTION 


For  incoherent  imaging  situations,  the  point  spread  function 

t (v  ,v  )  is  nonnegative.  Its  Fourier  transform,  the  unnormalized 
x  y 

optical  transfer  function  R(a  ,a  ) 

x y  y 


R(a 

x 


00 

t(v  ,v  )exp(ia  v  +  ia  v  )da  da 
x*  y  ^  xx  y  y  x  y 

—00 


(1.1) 


has  compact  support  in  the  a  ,a  plane  that  is  twice  the  size  of 

x  y 

the  aperture  of  the  optical  system.  This  follows  directly  from 

the  fact  that  R(a  ,a  )  is  also  the  two-dimensional  convolution 

x  y 

of  the  pupil  function  of  the  optical  system  [1]. 

These  two  requirements  place  constraints  on  the  behavior 

of  R(a  ,a  ).  The  first,  that  R(a  ,a  )  be  a  positive  definite 
x  y  x  y 

function  (i.e.,  have  a  nonnegative  Fourier  transform)  gives  the 
well  known  results 

| R(a  ,a  ) |  <  R( 0 , 0 )  (1.2) 

A  «y 

R(ax,ay)  =  R*(-ax,-ay)  (1.3) 

The  second,  that  R(ax,ay)  has  bounded  support,  enables  calcula¬ 
tion  of  a  function  A+(ax,ay)  such  that 

|R(ax,ay)|  <  A+(ax,ay)R(0,0)  (1.4) 
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The  first  such  calculation  of  X+  was  done  by  Boas  and  Kac  [2,3] 
where  they  derive  the  least  upper  bound  for  the  one-dimensional 
problem  of  a  slit  aperture  on  [-1,+1] 

X+(a  )  =  cos  - —  ,  |a  [  <  2  (1.5) 

[2/i“x3+1 

whe  re 

[2/| a  |]  =  least  interger  not  less  than  2/ja  j  (1.6) 

and  describe  functions  that  achieve  this  bound.  Although  their 
papers  do  not  mention  any  physical  applications.  Professor  Kac 
in  a  private  conversation  with  one  of  us  (R.B.)  remarked  that 
the  analysis  was  done  in  connection  with  radar  antenna  design 
at  the  MIT  Radiation  Laboratory  during  World  War  II.  This  bound 
was  independently  derived  by  Lukosz  [4,5]  using  linear  system 
theory. 

A  similar  result  holds  in  higher  dimensions.  Suppose  a 
bounded  aperture  A  satisfies  the  condition  that  if 

P  =  (Pj ,P2,-‘-,Pn)eA  (1.7) 

then  the  aperture  contains  the  n-dimensional  rectangle  whose 
vertices  are  the  points 

q  =  (+P! ,+P2,’“ >±Pn)  (1.3) 
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It  follows  that 


| R(a ) {  <  A+(S)R(0)  (1.9) 

where 

A+(a)  =  cos  ,  a  >  0  (1.10) 

and  (2/a)a  lies  on  the  boundary  of  A.  The  proof  is  too  lengthy 
for  full  reproduction*  but  its  outline  is  as  follows.  First, 
the  result  is  shown  for  n-dimensional  rectangles  centered  at  the 
origin.  Next,  a  sequence  of  these  rectangles  is  imbedded  inside 
A  such  that  the  rectangles  are  centered  on  the  origin,  have  their 

/V  A 

major  axis  in  the  direction  a,  and  contain  the  point  (2/o)a  in 
their  limit.  Given  this  construction,  it  is  possible  to  show  that 
a  least  upper  bound  for  such  a  sequence  must  be  a  least  upper 
bound  for  all  pupil  functions  over  A,  and  that  the  least  upper 
bound  for  the  sequence  is  Eq.  1.9- 

In  practice,  additional  constraints  can  be  imposed  on  R(a); 
in  particular  by  requiring  symmetries  in  the  point  spread 
function  t(v).  In  reference  [2]  it  is  shown  that  the  require¬ 
ment  that  t(v)  be  even  does  not  lower  the  bounds  in  Eq .  1.5. 

Lukosz  [5J,  shows  that  on  a  square  aperture  the  requirement  of 
symmetry  on  the  pupil  function  A(p,q)  (where  p,q  represent  the 
exit  pupil  coordinates  of  the  aperture)  in  either  the  p  or  q 
axis,  or  in  both,  leads  to  the  same  improvement 
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,+  ,  U  TT 

A  C  CX  ,Cl  )  ^  COS  r  ~  /  I - 11,-1  COS  TV  /~| - 1  1  7  -I 

x*  y'  -  L  2/ 1  a  ]  J+l  [2/|a  |]+1 

A  J 

where  la  I  , 1  a  I  <  2 . 

i  xi  » i  y  - 

In  this  paper  the  case  of  a  circular  aperture  with  radially 
symmetric  pupil  function 

A(p,q)  H  A(p)  ,  p  =  (p2+q2 )^  ,  o  £p<l  (1.12) 

is  considered.  Obviously  the  unnormalized  transfer  function  now 
becomes  '  j 

! 

R(a  ,a  )  =  R(w),  w  =  (a2+a2)2  ,  (1.13) 

x  y  x  y 

Frieden  [6]  gives  some  numerical  estimates  of  A+  for  this  case; 
we  supplement  this  work  with  some  analytical  bounds .  Furthermore 
using  a  different  numerical  procedure  outlined  in  Section  h ,  we 
demonstrate  that  in  this  case  the  greatest  lower  bounds  and  least 
upper  bounds  are  different .  This  contrasts  with  the  other  cases 

|  /V  A 

discussed  above,  where  it  was  shown  that  functions  R  (a),R  (a) 
could  be  found  such  that  [2] 

R+ (a )  =  A+(a)R(0) 

R~(a)  =  -A+(a)R(0) 


(1.14) 


Radial  symmetry  implies  that  t(v)  is  even  as  well  as  real 
so  both  R(co  )  and  A(p)  are  real.  This  symmetry  and  Eq.  2.6  show 
that  the  ratio  R(w)/R(0)  is  the  Rayleigh  quotient  [7]  of  a  real 
symmetric  operator.  Thus  the  extrema  A*  and  A~  of  the  quotient, 
where 


A“  <  <  A  + 

a)  —  R  (0  )  —  u) 


(1.15) 


have  the  alternative  characterization  of  being  eigenvalues  of  a 
symmetric  operator  (see  Eq.  2.9).  We  use  this  property  in  esti¬ 
mating  A*  and  A^.  The  radial  symmetry  situation  is  intrinsically 
different  from  the  previous  cases  because  the  transfer  function 
must  now  satisfy  a  symmetric  integral  equation  of  the  second  kind. 
In  contrast,  the  equation  similar  to  Eq.  2.9  derived  for  a  slit 
aperture  in  [2]  by  the  calculus  of  variations  is  a  continuous 
algebraic  equation 

2AA(p)  =  A(a-p ) +A(a+p )  ,  |pj  <  1 

=0  ,  | p |  >  1  (1.16) 
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2.  PRELIMINARIES 

Let  A(p)  denote  the  radially  symmetric  real-valued  pupil 
function  over  the  circular  aperture.  The  unnormalized  transfer 
function  R(co)  can  be  written  in  the  form  [6,8] 
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Note  that 


R( 0 )  =  2tt|  |  A|  |  2  (2.7) 

Since  the  two  dimensional  convolution  of  two  radially 
symmetric  functions  over  the  circular  aperture  of  unit  radius 
is  also  symmetric,  then  is  a  self-adjoint  operator  with 
respect  to  the  defined  inner  product.  This  and  Eqs.  2.1,  2.7 
imply  that  R(w)/R(0)  will  be  maximized  or  minimized  if  and  only 
if  A(p)  is  an  eigenfunction  of  K  corresponding  to  the  maximum 

(i) 

..or  minimum  eigenvalues  X  +  and  X~,  see  reference  [7].  Thus  we 

w  oj 

solve  the  homogeneous  integral  equation 

27T 

A(q)d0  =  AwA(p)  (2.8) 

o 

or  equivalently 

( K  A) ( p )  =  X  A(p)  (2.9) 

OJ  0) 

for  the  eigenvalues  X*  and  X“,  along  with  their  corresponding 
eigenfunctions  A(p). 

In  the  following  sections,  various  methods  for  finding 
X+  and  X-  are  outlined,  along  with  comments  on  behavior,  per- 
formance ,  etc.  of  the  corresponding  eigenfunctions  and  transfer 
functions . 
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3. 


ANALYTIC  BOUNDS 


In  this  section  analytic  methods  are  used  to  bound  the 
modulus  of  the  largest  eigenvalue  of  using  a  series  of 
integral  inequalities  suggested  by  [9].  K  can  be  written 

U) 

in  the  normal  (symmetrized)  form 

i 

(KuA)(p)  =  4 1  k(oj,p,t)tA(t)dt  (3.1) 

0 

where 

k(oj,p,t)  =  [4p2t2-(co2-p2-t2)2  ]“'%  |o)_p|  <  t  <  w+p 

=  0  ,  elsewhere  (3.2) 

Unfortunately 

i 

|  |  tk(o>,  p  ,t )  |  2dt  =  00  (3-3) 

o 

so  that  tk(aj,p,t)is  not  an  L2  kernel;  consequently  a  quick 
bound  on  the  eigenvalues  cannot  be  established. 

Nevertheless,  we  can  proceed  in  the  following  fashion  by 
first  defining: 

i 

r 

kjtiOjp)  =  |  k(co,p,t  )A(t  )2tdt  |  (3.11) 

o 

f  1 

k2(o>,p)  =  |  k(w,p,t  )2t|  dt  (3-5) 

o 
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(3.6) 


c,  (ci> ,  t )  =  |k(w,p)|k(o),p,t)j2pdp 


S  0<t<l 


Since 


R  (  <d  )  =  2  A(p)pdp 


r 


k(co , p  ,  t  )A( t )  2tdt 


by  virtue  of  Eq .  2.1,  then 


R(w)|  < 


r  i 


A2  ( p ) pdp 


k2  (w,p)pdp 


by  Schwarz's  inequality.  Furthermore  the  modulus  of  kt  (co 
be  similarly  bounded 


I  kj  (<u,p)  |  < 


i 


|  k  (co ,  p  ,  t )  2t  j  dt 


k(co  ,p  ,t  )A2  ( t '!  2t  Jdt 


Lo 


These  inequalities  imply  that 


i  i  i 

j  k2(u,p)pdp  <_  j  k2  (oj,p)pdp|  k(o),p. 


t ) A2 (t ) 2tdt 


<_  j  A2(t)tdtj  k(co,p,t)k2  (w,p)2pdp 
o  o 


max 


-  0<t <1  3 


k  (co  ,t)  |  |  A|  | 


(3.7) 

(3-8) 

(3-9) 

, p)  can 

(3.10) 


(3.11) 
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Upon  using  Eqs .  2.7  and  3-7,  we  finally  obtain 


|R(«)|  <  Ck„  (to)]55  R(0)/n  (3.12) 

The  function  [k  (co)  ]  2/tt  was  evaluated  numerically  and  appears  in 
the  first  column  of  Table  1. 


A  useful  result,  found  by  further  use  of  inequalities  on 
the  definition  of  k  (co),  is 


I  x J  i  |  [k4 (co) ^  arccos  |  (3.13) 

for  /2  <_  co  <2.  Furthermore,  Eq.  3-13  is  the  asymptotically 

optimal  bound  in  the  sense  that  a  sequence  of  pupil  functions 

A  *  (p)  can  be  constructed  that  generate  extremal  transfer 

functions  R  »  (oj),  such  that  as  u1  +  2  then 
to 


1 

2 


arccos  (co  *  /2  ) 

IT 


<  r0l)  i  ^ M  u  ^  arccos  (co 1  /2 ) 

“  R^,  (co=0 )  _  77 


(3.14) 


where 

A^,(p)  =  0  ,  0  £  p  <_  co'/2 

=  1  ,  co’/2  <  p  <  1  (3.15) 

Numerical  calculations  suggest  that  the  greatest  lower  bound 
constant  in  Eq .  3-14  is  not  0.5  but  0.73-  These  bounds  comple¬ 
ment  the  bounds  found  by  Macdonald  [8]  for  c o'  -*•  0. 
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4. 


NUMERICAL  METHODS 


The  second  approach  tried  was  a  direct  calculation  of  A* 
and  A  ~  by  appropriate  discretization  of  the  integral  equation, 

Eq.  2.9.  A  finite  dimensional  Galerkin  approximation  scheme  was 

A 

used  to  form  a  finite  dimensional  operator  approximation  C  to  K  . 

O)  0) 

We  first  express  the  pupil  function  as  the  finite  expansion 
N 

A  ( p )  =  l  an<fn(p)  ,  0  <_  p  <  1  (4.1) 

n=l 


The  basis  functions  are 


(p)  = 


pn+l  pn 


2  xn(p) 


(4.2) 


whe  re 


xn(P)  =  1  if  p£Cpn,Pn+1] 
=  0  if  pfc[pn,pn+1] 


(4.3) 


and  Pn(n  =  1,2,***,N+1)  is  the  mesh  imposed  on  [0,1].  The  <J>n(p) 
are  orthonormal 


(<p  ,<p  )  =  <S 
rn,Tm  nm 


(4.4) 


Substitution  of  Eq.  4.1  into  Eq.  2.9  yields 
N  N 

£  an(K<A)(p)  =  X  E  Vn(p) 

n=l  n=l 


(4.5) 
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Direct  use  of  the  orthogonality  relation  then  gives 


y  ( <p  ,  K  $  )  a  =  Xa 
m’  co  n  n  n 

n=l 


(4.6) 


where  m  =  1,2,* ••>N.  We  can  rewrite  this  in  matrix  form 


C  A  =  y  A 

CO  CO  to  <0 


(4.7) 


where  C  is  a  symmetric  matrix  since  (<J>  ,K  <{>  )  =  (<J>  ,K  d>  ).  The 
co  J  nr  co  n  nJ  wTm 

A 

elements  of  C  could  be  exactly  evaluated  since  the  <J> 
co  m 

are  piecewise  constant  functions.  We  use  y  to  denote  the  eigen- 

A 

value  of  C to  distinguish  them  from  the  eigenvalues  X  of  K^. 

The  Rayleigh  quotient  corresponding  to  the  symmetric  matrix 
is  [7,10] 


A  A  A  A  A 


T ( co )  =  (A  ,C  A  ) / ( A  , A  ) 
co’coco  co’co 


(4.8) 


Examination  of  Eqs.  2.6  and  4.8  shows  that  the  Rayleigh  quotient 
can  also  be  written  as 


m  (  .  \  _  ,  R(  H  ) 
T(co)  -  pjoy 


(4.9) 


According  to  a  theorem  in  matrix  theory  [7,10] 


^  I  T(co)  < 


(4.10) 


where  is  the  largest  positive  eigenvalue  of  and  y  the 
largest  negative  eigenvalue  and 
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y 


+  _  max 
a)  "  A  /Q 


T(co)  >  0 


y 


—  _  min 
w  ~  A  7*0 


T(co)  <  0 


(A. 11) 


Since  is  symmetric  all  eigenvalues  are  real,  but  their  values 
must  be  determined  numerically. 


The  Rayleigh  quotient  was  not  calculated  directly  from  Eq . 

A 

A. 8.  Instead  a  singular  value  decomposition  of  into  the  form 


C  =  UEV+  (4.12) 

co 

/\  /V  A 

was  performed,  where  U,V  are  orthogonal  matrices  and  £  is  a 

diagonal  matrix  whose  entries  consist  of  the  absolute  values  of 

/\ 

the  eigenvalues  of  C^.  The  theory  is  detailed  in  [11],  a  stable 
numerical  algorithm  due  to  Golub  and  Reinsch  [12]  was  used  for 

A 

the  actual  calculations.  The  signs  of  elements  in  E  were  re- 

A  A  A 

covered  from  the  signs  of  columns  of  U  and  V.  Since  C  is  sym- 

CO 

A  A 

metric  U  and  V  are  identical  up  to  signs  on  their  columns.  These 
in  turn  determine  the  signs  of  the  eigenvalues. 


The  eigenvectors  associated  with  y  +  and  y  can  now  be  read 

0)  co 

A  A  A 

off  from  the  columns  of  U  or  V  and  give  an  approximation  A^  to 
the  pupil  function  A  (p)  that  will  maximize  (or  minimize) 
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R(<o)/R(0).  Calculation  of  the  transfer  function  associated 
with  A  (p)  is  effected  by  numerically  evaluating  the  inner 
product,  Eq.  4.8,  using  the  fixed  vector  A  ,  but  varying  the 

A 

matrix  corresponding  to  a  variation  in  its  subscript  co  over 
the  interval  (0<ox2). 


5. 


NUMERICAL  RESULTS 


Using  the  discretization  scheme  of  section  A,  u+  and  u 

CO  OJ 

were  calculated  using  M  =  40  in  Eq .  4.1.  The  results  are  listed 
in  Table  1  and  appear  in  graphical  form  in  Pig.  1.  There  are 
four  sources  of  error  in  p*  and  p~  as  approximations  to  A  +  and 
A^;  we  discuss  them  briefly.  Round-off  errors  are  negligible 
since  all  calculations  were  done  in  double  precision  and  the 
matrices  manipulated  had  relatively  small  dimension.  The  choice 

A 

of  basis  functions  gave  closed  form  expressions  for  removing 
the  need  to  numerically  estimate  the  approximate  operator  with 
attendant  errors  introduced.  Various  checks  run  on  the  singular 
value  decomposition  suggested  at  least  8  figure  accuracy  in  the 
decomposition,  so  that  error  in  the  decomposition  could  be  dis¬ 
regarded  . 

A 

The  main  source  of  error  occurs  in  approximating  by  C^. 
Calculations  indicated  that  the  error  | | K  -C^ | |  was  proportional 
to  1/N.  However,  is  a  compact  operator  so  it  has  an  infinite 

A 

sequence  of  eigenvalues  converging  on  zero,  whereas  being  finite 
dimensional  has  at  most  N  distinct  nonzero  eigenvalues.  There- 

A 

fore,  the  relative  error  of  an  eigenvalue  p  of  C  as  an  approxi- 

0) 

mation  to  an  eigenvalue  A  of  depends  on  N  and  the  ranking  of 
A  among  the  eigenvalues  of  arranged  in  absolute  value.  For 
all  oj,  A*  has  the  largest  modulus,  so  is  rapidly  approximated 


by  y*.  For  .  5<w<2,  A^  has  the  second  largest  modulus  and  is  also 
well  approximated  by  y  .  The  relative  error  in  both  these  cases 
for  N  =  40  was  estimated  as  less  than  0.1%.  As  u  +  0,  it  was 
observed  that  A  -*-0  so  that  A^  had  a  low  ranking  by  absolute  value. 
Thus  to  achieve  a  required  relative  error  in  y~  larger  N  values 
would  be  necessary.  Nevertheless,  for  .l<w<_.5,  N  =  40  still  gives 
an  estimated  error  of  less  than  1% . 

In  Figs.  2-5j  the  eigenfunctions  A+(p)  and  A~(p)  are  plotted 
for  some  representative  values  of  oi.  The  associated  normalized 
transfer  functions  T(oi)  =  R(oi)/R(0)  are  shown  in  Figs.  6-9. 

For  a)<l,  the  eigenfunctions  were  calculated  using  an  equispaced 
mesh  on  [0,1]  in  Eq.  4.3 

pn  =  Jf  (n-1)  ,  n=l ,  •  •  •  ,  41  (5.1) 

When  ui>l,  then  the  kernel  k(oi,p,t)  in  Eq .  3-2  is  zero  for  p  or 
t  in  [0,a)-l],  as  these  points  are  no  longer  in  the  support  of 
the  autocorrelation  in  Eq.  2.1.  Thus,  Eq .  2.9  becomes 

AA(p)  =  (K  A) (p)  ,  oi-l<p<l 

oi  —  — 

=  0  ,  0<p<oi-l  (5.2) 

So  if  A  /  0,  then  A(p)  =  0  for  pe[0,oi-l].  For  example,  let 
oi=1.2;  then 

A+  (p),  A~  (p)  =  0  for  0  <  p  < . 2  (5-3) 

1*2  1-2  —  “ 
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Thus,  the  optimum  pupil  functions  have  variable  amplitude  annular 
apertures  for  l<_u><_2 .  We  can  take  advantage  of  this  fact  for 
u»l  by  using  the  following  equidistant  mesh  on  [co-1,1] 

Pn  =  (a>-l )  +  i  (2-w)  (n-1)  (5.4) 

for  n=l ,  2 ,  •  •  • , 4l . 

Since  A*  is  the  largest  eigenvalue,  then  as  expected  from 

theory  [7],  the  corresponding  eigenfunction  A+(p)  will  not  change 

0) 

sign.  As  a  consequence  the  corresponding  optical  transfer  func¬ 
tion  is  also  one  sign  (i.e.,  T(co)_>0,  as  witness  the  solid  curves 
in  Pigs.  6-9).  When  co->.5,  since  A~  is  the  second  largest  eigen¬ 
value  then  A~  ( p )  will  change  sign  once.  However,  for  u><  .  5  ,  A~ 

0)  —  0) 

has  a  lower  ranking  and  A^(p)  should  oscillate  more  rapidly. 

These  oscillations  in  the  pupil  function  induce  corresponding 
oscillations  in  the  transfer  function.  All  these  features  appear 
in  the  graphs. 

Frieden  [6]  obtains  value  for  y+  and  A+(p)  by  expanding 

co  co 

A(p)  in  a  Fourier-Bessel  sampling  expansion  and  maximizing  the 

normalized  optical  transfer  function.  His  results  for  y+  are 

co 

in  close  agreement  with  our  values.  His  graphs  of  A^(p)  are 
also  in  general  agreement  with  our  functions,  although  his  eigen¬ 
functions  are  much  less  smooth  than  ours  for  co>l.  This  is 
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because  the  piecewise  constant  representation ,  Eq .  4. 
a  better  resolution  of  A*(p)  than  the  sampling  series 
for  m>1. 
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Pupil  functions 


that  achieve  the  upper  bound 


and  lower  bound  (-  -  -)  at  u  =  .4. 
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SECTION  4 


MOMENT  ESTIMATOR  APPROACH  TO  THE  RETRIEVAL  PROBLEM 
IN  COHERENCE  THEORY 

ABSTRACT 

The  purpose  of  this  paper  is  to  suggest  a  possible  approach 
to  the  recovery  of  the  spectral  density  function  g(oj)  through 
a  knowledge  of  the  first  few  measured  complex  zeros  of  the 
complex  degree  of  coherence  y(t).  The  assumption  that  y(i)  is 
band-limited  allows  us  to  express  the  sums  of  inverse  powers 
of  the  complex  zeros  of  y(t)  in  terms  of  the  moments  of  g(to). 
Only  the  lowest-order  moments  can  be  evaluated  in  this  manner 
with  any  accuracy  for  reasons  discussed  in  the  text.  We  use 
two  estimation-type  solutions  that  utilize  lower-order  moments: 
beta  distribution  model  and  the  Shannon  maximum  entropy  model 
to  estimate  g(CD).  Representative  numerical  calculations  are 
discussed. 
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The  purpose  of  this  paper  is  to  suggest  a  possible  approach  to  the  recovery  of  the  spectral  density 
function  gin>)  through  a  knowledge  of  the  first  few  measured  complex  zeros  of  the  complex  degree  of 
coherence  ytr).  The  assumption  that  yir)  is  band-limited  allows  us  to  express  the  sums  of  inverse 
powers  of  the  complex  zeros  of  yiri  in  terms  of  the  moments  of  giio).  Only  the  lowest-order  moments 
can  be  evaluated  in  this  manner  with  any  accuracy  for  reasons  discussed  in  the  text.  We  use  two 
estimation-type  solutions  that  utilize  lower-order  moments:  beta  distribution  model  and  the  Shannon 
maximum  entropy  model  to  estimate  giCD).  Representative  numerical  calculations  are  discussed. 


I.  INTRODUCTION 

The  complex  degree  of  temporal  coherence  itr)  and  the 
spectral  density  g{ui)  are  related  hv  the  integral  equation1  - 

->(rl=^  g(w)  c~"-'dw  <  r  <  oo.  (1.1) 

The  spectral  density  obeys  the  constraints 


(ai  glodSsO,  (1.21 

(b)  §  g(w)dw  =  1  =  >(()).  (1.2) 

Considerable  efforts  along  diverse  lines  have  been  expended 
in  attempting  to  invert  Eq.  (1.1)  when  only  the  modulus  of 
>(  r)  is  known.  The  literature  on  this  subject  is  vast,  but  some 
representative  references  which  1  have  found  useful  are  in- 
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dilated  in  Kefs.  2-11.  As  an  historical  note,  it  should  he 
mentioned  that  Akutowicz12-1 1  had  made  a  rather  complete 
study  of  this  problem  for  the  /.  situation,  anticipating  many 
subsequent  efforts.  He  was  well  aware  of  the  physical  prob¬ 
lem  and  quoted  the  x-ray  reconstruction  problem. 

Recently,  Napier14  has  been  able  to  measure  the  first  lew 
complex  zeros  of  y  tr  I  (see  Figs.  :t  and  7  of  his  paper)  Napier’s 
very  interesting  work  suggests  that  a  possible  approach  to  t In- 
inversion  of  Kq.  (1.1)  would  lie  through  a  knowledge  of  the  first 
few  measured  complex  zeros  of  y  (  r ). 

We  briefly  outline  the  approach,  leaving  the  details  to  the 
main  text.  We  first  assume  that  the  measuring  device  is  a 
bandpass  filter  so  that  there  is  an  upper  limit  to  the  possible 
measured  frequencies  in  Kq.  11.1)  This  requirement  suffices 
to  make  y(r)  an  entire  function  of  exponential  type  when  r 
is  extended  to  the  complex  r  plane.  An  entire  function  of 
exponential  type  has  an  infinite  number  of  complex  zeros:  let 
us  call  them  r„.  These  zeros  can  lie  ordered  in  terms  of  their 
moduli  so  that  |ri|  <  |Tj|  <  It  is  the  zeros  with  the  smaller 
moduli  that  are  measurable.14  We  express  the  power  mo¬ 
ments  (u j")  of  g(u)>  through  the  intermediary  of  the  cumu- 
lants  ofg’tu;)  in  terms  of  the  complex  zeros  of  y  I  r).  Although 
all  moments  can  he  evaluated  in  this  manner,  only  the  low¬ 
est  order  ones  can  be  evaluated  from  the  zeros  with  any  ac¬ 
curacy  for  reasons  discussed  in  the  text.  We  cannot  use  the 
moments  of  g(ui)  to  reconstruct  g(u>)  owing  to  the  ill-posed 
nature  of  the  moment  problem.1  A-lh  Given  these  consider¬ 
ations,  we  go  to  estimation-type  solutions  that  utilize  the 
lower-order  moments.  W'e  use  two  such  methods,  the  beta 
distribution  model  and  the  Shannon  maximum-entropy 
model,  to  estimateg(w)  in  the  important  case  when  it  is  uni- 
modal.  Finally,  some  comments  are  made  on  the  two-di¬ 
mensional  version  of  the  problem. 

II.  FORMAL  ANALYSIS 

To  begin  we  replace  the  limits  in  Eq.  (1.1)  by 

■y(r)  =  J  g(ud  e_l,“du)  0  <  |r|  <  <*>,  (2.1) 

where  both  b  and  a  are  finite  real  numbers.  An  essential  re¬ 
quirement  is  that  b,a  are  the  effective  cutoff  points  in  that  the 
integration  interval  cannot  be  reduced  without  altering  the 
value  of  the  integral.  y(r)  is  now  a  band -limited  function  and 
by  the  extended  version  of  the  Paley-Wiener  theorem17  is  an 
entire  function.  As  a  mathematical  note,  we  will  always  work 
in  the  context  of  Li  functions. 

There  have  been  several  studies  of  the  analytical  properties 
of  integrals  such  as  Eq.  (2.1),  but  the  most  useful  for  our 
purposes  is  that  of  Titchmarsh.18  Nussenzveig*  has  listed 
some  of  Titchmarsh ’s  theorems  that  are  of  relevance  to  the 
present  application  in  Sec.  Ill  of  his  paper.  The  most  im¬ 
portant  theorem  is  that  (in  our  notation)  y(r)  can  be  written 
in  the  form 

7(T)  -  e-iirtHfc+.hr  fj  (!  —  T/T„),  (2.2) 

n  •  1 

where  the  product  is  extended  over  all  the  zeros  of  y(  r ).  The 
product  is  conditionally  convergent.  This  expression  is 
simpler  to  use  than  the  usual  Hadamard  canonical  product 
representation1 7 19 


y(T)  =(•-■ .  11  111  -  tit,,  )e ;  ••].  (2.:ii 

>.  - 1 

y  ( r )  is  hermit  ian  symmetric,  y  I  -  r  I  =  y  *  I  r '  I .  because  gtwi 
is  real.  This  implies  that  it  r„  is  a  zero  of  y  <  r /.  then  so  is 
Thus  the  distribution  of  zeros  is  symmetric  wile  respect  tot  he 
imaginary  r  axis.  The  non-negativity  requirement  on 
further  restricts  the  zeros:  obviously,  none  ran  now  lie  upon 
t  he  imaginary  •  axis. 

W  t*  now  take  the  logarithm  of  both  sides  ol  Kq.  Since 

we  are  working  with  a  complex  variable  t.  we  confine  our  at 
tention  to  the  principal  branch  of  the  logarithm  and  denote 
if  by  l,og  in  lontormily  with  recent  usage.-’"  Thus  for  all  *> 
that  are  not  real  and  negative. 

Logy  =  Log|>  1  +  10,  12.41 

where  </>  is  the  unique  value  of  arg  y  satisfying  —  tr  <  y.  <  jr. 
This  is  consistent  with  the  convergence  test  for  an  infinite 
product  containing  complex  variables.-’1'  Thus  we  have 

Logy  t  t  )  =  1 1 ._,)(/)  +  a  U—ir  )  +  V  Log!  1  -  t/t„  I. 

»iTi 


The  infinite  series  containing  the  logarithm  can  he  expanded 


Log  |l  -  J  =  -  iJ  ‘  |~J  M<|r,|.  (2.(1) 

where  the  series  on  the  right-hand  side  converges  for  all  1  r [ 
<  J r j | -  Substitution  of  this  expression  into  Eq.  (2.5)  and 
subsequent  re-summing  hv  changing  the  order  of  summation 
yields 

Logy (t)  =  v  —  A’„(-,’rl''  |r|  <  |r,|.  (2.71 

.1  =  1  n\ 

The  K„  are  given  by 

K,  =  0/.,){b  +  a)-iZ  (T/)-',  (2.8) 

i 

Kn  =  -in(n  -  1 )!  V  ( 7-y )  ~ "  n  >  2,  (2.9) 

i 

and  are  essentially  the  cumulants  associated  with  g(o;).  The 
left-hand  side  of  Eq.  (2.7)  is  the  second  rharacte'  istic  func¬ 
tion21  extended  to  the  complex  r  plane. 

Returning  to  Eq.  (2.1),  we  expand  the  exponential  in  the 
integrand  and  integrate  termwise  (which  is  permissible).  The 
final  result  is 


>(r)  =  £  —  <  w)n )  (—i  t  )”, 
n  =  l>  n\ 


J> 


fitw'ldw 


are  the  moments  of  ui  about  the  origin.  These  moments  al 
ways  exist  and  are  finite  since  the  limits  of  integration  are  fi¬ 
nite.  The  infinite  series  converges  in  the  entire  complex  r 
plane  (i.e.,  it  has  an  infinite  radius  of  convergence  as  befits  an 
entire  function  of  exponential  type). 

The  cumulants,  in  themselves,  are  of  no  great  intrinsic  in¬ 
terest.  It  is,  rather,  that  they  serve  as  ancillary  functions 


enabling  us  In  determine  the  moments  in  terms  ol  the  complex 
zeros  r„. 

I  he  A„  and  (ui" )  are  related  by  a  sequence  of  nonlinear 
equations  that  can  lie  obtained  from 


L.  —  A„l-ir)"  =  Log  1+  v 
"■I  n  -i  n! 


(2.12) 


where  in  view  of  Kq.  (2.6),  |r|  <  |r,|  We  simply  differentiate 
hotli  sides  with  res|>ect  to  r  and  then  equate  (lowers  ot  I  — i  r  I. 
The  first  few  equations  are 


K  i  =  (w>, 

<wi)Ki  +  A  =  (u)1),  (2.13) 

t 1  >(u'J )A]  +  < « )  A  ■  +  ( 1  i A  i  *  t 1  i i(ui’*). 

C|Min  solving  for  the  <»•”)  in  terms  of  the  A.,  we  have 

<«>  =  A,. 

(“>-’)  =  A  ■  +  A;, 

(2.14) 

(u>A)  =  A  -f  3 A  A"  i  +  A  ,'. 

<w4>  *  A  ,  +  4A  ,A  ,  +  3A  +  HA  ,A j  +  A',1. 

It  is  a  well-known  property  of  these  functions  that  the  first  .V 
cumulants  are  determined  hy  th<  first  .V  moments,  and  rice 
is  r»o  I  hus.  we  have  established  relations  between  the  mo 
ments  (a'"  and  the  complex  zeros  r„  through  the  interme¬ 
diary  of  the  cumulants. 

Only  the  first  cumulant  contains  any  parameters  besides 
the  complex  zeros.  We  can  immediately  establish  the  rela 
t  ion 


i  Oil"1  =  ('/l.HA  +  o)  -  <u)>  (2.151 

i 

hv  using  the  first  equation  in  Kq.  (2.141.  In  applications,  the 
limits  A  and  a  are  known.  Furthermore,  the  mean  yu))  can 
generally  be  determined  by  independent  means  so  that  we  can 
consider  the  sum  on  the  left-hand  side  as  known.  This  is 
fortunate  because,  in  general,  the  partial  sums  of  (r/)' 1  will 
converge  very  slowly.  Kquation  (2.15)  has  already  been  de 
rived  hy  Titchmarsh.1"  albeit  in  a  different  fashion. 

Kquation  (2.15)  is  important  for  another  reason.  The  series 
on  the  left-hand  side  is  absolutely  convergent:  as  a  conse¬ 
quence  the  complex  zeros  tend  to  be  located  near  the  real  axis 
for  large  n.  As  Nussenzveig1-  has  carefully  pointed  out,  it  is 
the  zeros  located  close  to  the  origin  that  basically  determine 
the  shape  of  g(ui).  The  asymptotic  distribution  of  zeros  is 
determined  hy  properties  of  the  cutoff  and.  consequently,  does 
not  contain  much  information  about  the  shape.  We  thus 
expect  that  the  more  peaked  the  spectrum  in  the  interval  (a. A), 
the  farther  out  from  the  origin  will  be  the  zeros. 

The  reader  may  well  wonder  why  we  do  not  work  directly 
with  the  infinite  product  representations  of  >(  r)  as  given  by 
Kqs.  (2.2)  and  (2.3)  using  a  finite  number  of  complex  zeros. 
The  reason  is  simple.  Although  convergent  for  all  r,  these 
products  are  never  used  for  numerical  evaluation  because  the 
convergence  is-  iniformly  slow  in  any  closed  set  avoiding  the 
complex  zeros.  Having  only  a  finite  number  of  zeros  available 
simply  compounds  the  problem.  These  difficulties  are  well 
known  to  numerical  analysts. 


III.  COMMENT  ON  THE  MOMENT  PROBLEM 

A  knowledge  of  all  the  moments  (*•">  ofgiuil  suffices  to 
enable  one  to  reconstruct  g(w).  To  Ire  somewhat  more  precise, 
il  the  moments  of  a  density  function  on  a  finite  interval  arv 
given  then  they  can  lie  used  to  reconstruct  uniquely  the  un 
known  density  function  ithis  is  the  Hausdortl  moment 
problem).  Ill  the  mathematical  problem  the  moments  are 
assumed  known  to  unlimited  accuracy  It  is  tempting  to 
naively  applv  these  considerations  to  our  problem. 

(liven  that  we  have  the  measured  value  of  the  first  .V 
complex  zeros  then  we  can  obtain  numerical  estimates  of  all 
the  cumulants.  W  hen  n  is  large,  only  the  first  zer..  effectively 
Contributes  to  the  numerical  values  ol  the  cumulants  A,  as 
defined  by  Kq.  (2.9);  consequently. 

A„ - i"(n  -  I  l!rj""  n  »  1.  (3.D 

I  nlortunately,  the  accuracy  ot  the  cumulants  sutlers  in  two 
distinct  wavs.  The  lower -order  cumulants  are  in  error  lie- 
cause  a  large  number  ol  zeros  must  lie  known  More  the  series 
in  Kq.  1 2.9)  converges.  The  higher-order  cumulants  are  in 
error  liecause,  although  only  a  lew  complex  zeros  need  he 
known,  these  measured  zeros  are  raised  to  very  high  inverse 
powers  with  their  attendant  inaccuracy  (It  rt  is  in  error  by 
l'f .  then  its  2>>t h  power  is  in  error  by  20' V.  I  Krrors  in  the  A„ 
manifest  themselves  by  propagating  even  larger  errors  into 
the  (u?'1 )  via  Kq.  (2.14).  Although  we  can  obtain  numerical 
values  lor  the  moments,  they  are  all  subject  to  var\  mg  degrees 
of  error. 

The  moment  problem  on  a  finite  interval,  although  deter 
minate.  is  ill-jiosed  in  the  sense  that  very  small  changes  in  the 
moments  lend  to  large  changes  in  gM.  Obviously,  we  cannot 
usefully  employ  the  Hausdorff  procedure  to  obtain  g I u- 1  for 
the  type  of  problem  we  are  discussing 

(liven  these  considerations,  we  opt  for  estimation-type 
solutions  that  will  utilize  the  lower-order  moments  tin  man 
ners  to  be  specified!.  In  particular,  we  confine  attention  to 
the  important  case  where g(w)  is  unimodal  but  not  necessarily 
symmetric.  We  will  use  two  different  methods:  the  beta 
distribution  model  and  the  Shannon  maximum  entropy- 
model. 


IV.  ESTIMATION:  BETA  DISTRIBUTION  MODEL 

In  the  beta  distribution  model,  the  spectrum  is  assumed  to 
be  of  the  form 


,  ,  lui  -  a)P~'{b  -  w)i-' 

fi(u>)  =  — - 

Rip.qHb  -a)P*i-1 
Hip.q)  is  the  beta  function 


o  <  w  <  A 


Hip.q)  =  r<p)r<q)/r(p  +  q). 


(4.1) 


(4.2) 


We  shall  be  concerned  with  the  case  where  the  parameters  p 
and  q  are  such  that  p  >  1  and  q  >  1.  Under  these  circum¬ 
stances  g(w)  is  unimodal  and  vanishes  at  the  endpoints,  i.e., 
g(a)  =  g(A>  =  0.  By  varying  p  and  q  we  can  obtain  a  wide 
variety  of  shapes  in  the  basic  interval  (o  <  w  <  A).  The  mode 
lvalue  of  w  for  which  g(u>)  is  a  maximum|  is 


m»de  =  ^--1)b'Mg~1>a 
p  +  q  -  2 


(4.3) 


I 


H_l4 


1 


(4  10) 


TABLE  I  Values  of  cumulams  K2  and  X4  in  terms  of  first  2N  zeros  of 

tan _ 


% 

% 

H 

Kt 

Error 

-K,  x  10-  ‘ 

Error 

6 

0.09876 

21.0 

0.92859 

0.95 

8 

0.104  17 

16.7 

0.93304 

0.48 

10 

0.107.13 

13.8 

0.93496 

0.27 

12 

0.11025 

11.8 

0.93592 

0.17 

14 

0.11213 

10.3 

0.93645 

0.11 

16 

0.11358 

9.1 

0.93677 

0.08 

18 

0.11474 

8.2 

0.93697 

0.06 

20 

0.11568 

7.5 

0.937  10 

0.04 

OP 

0.12500 

— 

0.93750 

_ 

The  moments  about  the  origin  are 

1  "  M 

(«>)  =  — - -  £  (6  -  a)n~'a'fttp  +  /,</).  (4.41 

B{p,q)i.o\ll 

The  beta  density  function,  Eq.  (4.1).  is  one  of  a  class  of 
probability  density  functions  belonging  to  the  so-called 
Pearson  system  of  density  functions.--  These  density  func 
lions  are  distinguished  by  the  fact  that  they  are  determined 
bv  the  first  four  moments.  The  algorithm  for  determining 
the  four  parameters  p,  </.  a,  and  b  in  Eq.  (4.11  in  terms  of  <ui>, 
. . <u)4>  is  described  in  the  Elderton  and  Johnson  mono¬ 
graph-2  to  which  the  interested  reader  is  referred  for  full  de¬ 
tails.  A  short  summary  of  the  relevant  parts  of  the  algorithm 
is  included  in  the  Appendix. 

In  the  calculations  that  follow,  we  assume  that  the  mean, 
(ui),  is  known  from  independent  measurements.  We  consider 
two  cases.  In  the  first  case,  g(w)  will  lie  symmetric,  while  in 
the  second  it  will  be  nonsymmetric. 

In  the  first  situation  we  take 


£(<*>)  = 


(1  4-  w)-~l/2(l  -  ui)1— 1/2 

•2'B(i>  +  V2,  c  +  ‘/2) 


-1  <  u>  <  1, 


(4.5) 


where  v  >  (i.e.,  p  =  q  =  e  +  %).  This  function  is  sym¬ 

metric  about  its  mode  (located  at  u>  =  0).  The  corresponding 
7(t)  is  given  by2-1 


7<r)  =  P(1  +  v)JM)li.rl2y  7(0)  =  1,  (4.6) 


a."  «  — - ,  it;*'  » - . 

22(1  +  e)  24(1  +  e)2<2  ♦  cl 

Consequently, 

„  I  3 

n  .  = - ,  K.  *  - • 

2e  +  2  4(e  +  l)2(c  +  2) 

All  our  calculations  will  lie  |iertormed  for  e  =  3.  Values  of 
the  cumulants  H ,  and  K ,  in  terms  of  the  first  2,V  zeros,  j  ( /, 
•>f  J  J  r  l  are  listed  in  Table  I  The  rate  of  convergence  of  the 
partial  sums  of  H  >  is  depresstngly  slow:  .V  =  20  is  still  7-5T  m 
error,  whereas  the  corresponding  partial  -urn  for  Kt  is  only 
IMMI  in  error. 


Calculations  (via  t he  Pearson  algorithmi  were  carried  out 
to  estimate  r  for  values  of  K  and  H 4  corresponding  to  .V  =■  10 
and  .V  =  20.  The  estimated  g(ui(  are 


•V  =  10: 


N  =  20: 


*(u>) 


tl  -ul-')1224 

ft  1 0.5,2. 224 1 


1  724: 


/Null 


(1  -^-V7"4 
B«). 5,2.784 1 


c  ~  2. 254; 


compared  with  the  true  g(w) 


N  ■■ 


«( Ul)  = 


(I 


ul2)2* 


ft  (0.5, 3.5) 


r  =  3.000. 


The  results  are  summarized  in  Fig.  1.  Although  the  resultant 
estimates  /J(u.->  are  symmetric,  they  possess  a  larger  variance 
owing  to  the  fact  that  the  values  of  u?  =  0  are  so  low. 


Generally  shaking  we  have  lieen  concerned  with  the  "worst 
possible  case.  Actually,  the  variance  of  the  original  spectrum 
/?(<*>) can  usually  be  measured  independently  with  fairaccu 
racy  (say,  3%-5%).  Thus,  if  we  take  K  ,  Ithe  variance)  to  he 
4%  in  error,  thereby  effectively  halving  the  error  in  K2  for  .V 
=  20,  and  repeat  the  calculations,  we  obtain 


A’(ui)  = 


■  2.610. 


ft(0.5.3.110l 

The  new  estimate  is  shown  in  Fig.  2  and  does  not  require  any 
detailed  comment. 


and  c  >  —'fa.  The  zeros  of  7<r )  are  given  by  the  zeros  of  J„(t). 
Now  these  zeros  are  real  when  e  >  -V2,  (see  Watson23  for  the 
proof).  Following  standard  notation  we  denote  the  positive 
real  roots  by;,/.  They  are  extensively  tabulated.23-24  When 
the  cumulants  are  expressed  in  terms  of  the  zeros,  we  have 

Kine  1  =  —  (2n)!i2n+1  £  |0',/>-,2n+» 
f-i 

+  (-;..f)_,2"+l,l*0,  (4.7) 

K2„=  —  (2n  -  l)!i2n2  £  O'.l)"*" 

/-l 

=  -2(2n  -  1  )i2"<rl*»,  (4.8) 

where 

"i"**  £  O',./)-2'1  (4.9) 

/-1 

are  the  Rayleigh  sums.2:1  Rayleigh25*  showed  that 


For  our  second  example,  we  take 

*(«)  =  u>P-l(l  -  w)^"VR(p.(/)  0  <  u?  <  1.  (4.12) 


FIG.  1.  Beta  spectrum  with 
v  *  3  and  estimates: 
_ exact  spectrum:  —  N 

=  10;  ---  N=  20 


0 

(x) 


r 


FIG  2  Bela  spectrum  with  i 

*  3:  - exact  spectrum 

modified  estimate  with  N  = 

10 


We  lei  p  »  2.  q  -  3  in  the  numerical  calculations.  According 
to  Kq.  the  mode  is  located  at  u)  =  and  the  curve  is 

asymmetric.  The  function  >(r)  corresponding  to  this  ex¬ 
pression  is 

><r)  =  Aftp.p  +  (4.13) 

where  A/ta.c.ur)  is  the  confluent  hypergeometric  function.-* 
In  our  particular  case. 

■y(r)  =  A#(2,3,— t'r) 

=  2(ir)~2  f*  xe-’dx.  (4.14) 

Ja 

by  virtue  of  Kq.  ( VI )  on  p.  46  of  Ret.  26.  Consequently, 

y(r)  =  2(ir)_2|l  -  0'T)e~"  -  e-i'|.  (4.15) 


These  estimates  are  plotted  along  with  the  true  g(w)  in  Fig. 
3.  The  estimated  /»(<*>)  corresponding  to  N  •  10  leaves 
something  to  be  desired,  however;  that  for  .V  =  20  is,  in  my 
opinion,  very  reasonable  As  before,  this  is  "worst  case." 
W  hen  we  lake  K  >  to  have  a  4fi  error  corresponding  to  an  in 
dependent  measurement,  then  the  new  estimate  is 

0,1042(1  _  W)2.103 
J?(uj)  *  — — - — - • 

All. 042,2.103) 

Although  we  will  not  plot  this  estimate,  it  is  very  close  to  the 
trueg(uil. 

V.  MAXIMUM  ENTROPY  SOLUTION 

(liven  that  we  have  numerical  estimates  of  the  first  .V  mo¬ 
ments.  we  wish  to  employ  this  inlormalion  to  approximate 
#(w)  without  the  intervention  of  an  assumed  glurl  as  in  Sec. 
IV.  Following  Shannon  '  and  Jaynes,-’*  we  seek  the  mini 
mallv  prejudicial  assignment  of  glut)  by  maximizing  its  en¬ 
tropy 

H  =  ~  £  g(u>)  log|g(o,)|do,  (5.1) 

suhject  to  the  power-moment  constraints 

(w")  *  S  u>nB^ul^u:  n  =  1......V,  (5.2) 

where  the  (u!n)  are  given. 

This  is  a  standard  problem  in  the  calculus  of  variations. 
The  furmal  solution  is 


The  complex  zeros  of  thi .  function  were  evaluated  using  the 
algorithm  developed  by  Delves  and  l.yness27  to  locate  and 
compute  all  the  zeros  of  a  given  analytic  function  that  lie  in 
a  given  region.  It  might  be  said  that  this  is  a  complicated 
problem.  Nevertheless,  we  know  the  theoretical  values  of  the 
cumulants;  they  are 

Ki=%,  K>  =  '^s.  K,  =  %75,  K4  =  %7 so.  (4.16) 
The  relevant  data  is  summarized  in  Table  II. 


g(u)  =  g0exp|-  £  Ana,’'|.  (5.3) 

where  A„  are  the  unknown  I.agrange  multipliers  and  g0  is  a 
normaliz  ng  constant  such  that  Eq.  (1.3)  is  satisfied.  The  A„ 
are  required  to  satisfy  the  constraints  that  the  first  N  mo¬ 
ments  ofg(ui)  equal  the  first  N  moments  ofg(ui).  This  leads 
to  iV  simultaneous  nonlinear  equations  for  the  A„.  These 
equations  can  be  solved  by  Newton-Raphson  iteration.  We 
omit  the  details. 


Calculations  were  again  carried  out  to  estimate  p  and  q  for 
values  of  i  he  cumulants  corresponding  to  ,V  =  10  and  .V  =  20 
of  Table  II.  The  estimated  g(ui)  are 

N  =  10: 

o)u  24l(l  -u,)<’8«7 

£(<*>)  =  — - ; 

fi<  1.241,1.887) 

N  =  20: 

Ul'  040(1  _  ull.828 

g(o>)  =  — - . 

8(2.040,2.828) 


TABLE  II. 
(2.3.— rr). 

Values  of  cumuiants 

in  terms  of 

first  2N  zeros  of  M 

Exact 

N  =  20 

* 

N 

© 

Ki 

0.4 

k2 

0.04 

0.03709 

0.03501 

K.3 

0.002286 

0.002237 

0.002191 

K,  _ 

-0.001029 

—0.001 028 

-0.001 027 

Some  numerical  calculations  have  been  carried  out  to  test 
this  estimation  algorithm.  The  results  cannot  be  said  to  be 
as  useful  as  one  would  hope  for.  One  of  the  main  difficulties 
is  that  gia)  ^  0and£(6)  x  0.  This  causes  a  severe  distortion 


FIG.  3.  Nonsymmetric  beta 
spectrum  and  estimates: 

- exact  spectrum;  —  N 

=  10:  ---  N  =  20 
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FIG  4  Nonsymmetnc  beta 
spectrum.  - exact  spec¬ 

trum.  -  -  -  maximum  entropy  es¬ 
timate  given  exact  moments 


of  the  estimated  /((at.  For  example.  g(u:)  given  hy  Kq.  14.12) 
was  numerically  tested  assuming  that  the  exact  values  of  the 
cumulants  (and  hence  the  moments)  are  given  bv  Kq.  (4.16). 
I  he  result  of  such  a  calculation  using  maximum  entropy  is 
given  in  Fig.  4.  The  estimated  curve,  given  the  exact  mo¬ 
ments.  is  in  error  owing  to  the  nonzero  behavior  at  the  end 
points.  Calculations  with  data  given  in  Table  II  were  even 
worse! 

A  modification  of  the  algorithm  to  require  the  a  priori 
constraints  fUa )  =  4(h)  =  0  would  probably  improve  the  sit¬ 
uation.  at  the  cost,  although,  of  an  even  more  involved  algo¬ 
rithm.  We  hope  to  undertake  these  modified  calculations  in 
the  near  future. 

VI.  COMMENT 

It  is  tempting  to  generalize  these  results  to  the  two-  (or 
more)  dimensional  problem 

T<Ti-r2)  =  y  (6.1) 

However,  no  straight  forward  genet  alizat  ion  is  possible.  When 
■V(Ti,t2)  is  considered  as  a  function  of  two  complex  variables, 
then  it  is  an  entire  function  of  these  two  variables  as  discussed 
in  Konkin."1  The  difficulty  is  that  •> ( z , . r _. )  does  not  possess 
any  kind  of  Hadamard  factorization  because  the  zeros  of 

'l.rz)  are  not  completely  discrete.  The  zeros  can  form  a 
mai  ifold  of  values.-"1  Consequently  we  cannot  express  the 
cumulants  in  terms  of  simple  sums  of  inverse  powers  of  the 
complex  zeros. 
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APPENDIX 

The  beta  spectrum.  Kq.  (4. 1 ).  contains  four  parameters:  a. 
b.  p,  and  q.  The  procedure  for  determining  these  parameters 
is  to  equate  the  sample  and  population  values  of  the  first  four 
moments  and  to  solve  these  four  simultaneous  nonlinear 
equations  for  the  four  unknown  parameters.  Now  the  sample 
values  of  the  first  four  moments.  Eqs.  (2.14),  are  expressed 
directly  in  terms  of  the  cumulants. 


I  he  recipe  is  as  follows:  I*  irst  calculate  the  parameters  iij 
and  dj  from  the  sample  moments: 

tfi  =  <<ui-  <u))3>*/<<«)-  <ui))2>:1.  (All 

=  <(ui  -  <*»:> )4 >/< (az  -  (ai>|^)2.  IA2) 

d,  measures  the  skewness  and  i,  (he  "flatness"  ofg(u>).  Both 
of  ,ht‘s‘‘  parameters  can  lie  written  in  terms  of  the  cumulants: 
in  fact. 

lii  -  K'i/Ki  (A.D 

&>«  <K«  +  (A4i 

Next,  calculate 

r  =  61  dj  -  di  -l)/<6  +  .Id,  -  2dz>.  (An) 

I  he  parameters  p  and  q  are  determined  from  r, 

P.V  =  V.  fir  -  2)  ±  rtr  +  2) 


xl- — * _ y- 

+  2M  +  ltitr  4-  1 )/ 


withp  <  v  if  di  >  0  and  p  >  q  it'd,  <0.  There  are  additional 
formulas  to  evaluate  u  and  h.  but  we  need  not  worry  a  limit 
them,  full  details  of  the  complete  algorithm  are  given  in 
Klderton  and  -lohnson 
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APPENDIX 


THE  SHANNON  NUMBER  OF  AN  INCOHERENT  DIFFRACTION  IMAGE 

ABSTRACT 

The  Shannon  number  of  the  incoherent  diffraction  image  of 
a  finite  object  is  defined  as  the  product  of  the  Strehl  criterion 
and  the  object  area.  The  general  situation  where  the  point 
spread  function  is  asymmetric  is  discussed  and  the  Shannon  number 
is  expressed  in  terms  of  the  singular  values  and  singular  func¬ 
tions  of  the  convolution  kernel  integral  equation  governing 
image  formation.  When  the  point  spread  function  is  symmetric, 
only  the  singular  values  enter  into  the  description  of  the 
Shannon  number. 


f 
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The  purpose  of  this  note  is  to  develop  an  expression  for 
the  Shannon  number  of  the  incoherent  image  of  a  finite  object, 
see  Eq.  11,  in  the  general  case  where  the  point  spread  function 
is  asymmetric. 

Assuming  that  the  isoplanatic  condition  holds,  then  the 
relation  between  an  incoherently  radiating  object  o(x)  of 
finite  size  (a,b)  and  its  corresponding  diffraction  image  h(x) 
is  given  by  the  convolution  relation  [1] 

rb 

h ( x )  =  j  t(x-y)  o (y )  dy  ,  jx|  <  °°  ,  (1) 

a 

where  t(x)  is  the  point  spread  function  of  the  optical  system. 
The  functions  h,  t,  and  o  are  real  and  nonnegative. 

The  kernel  of  this  integral  equation  is  generally 
asymmetric  (such  as  for  systems  suffering  from  coma).  The 
formal  procedure  for  solving  Eq .  1  is  the  Schmidt  [2]  theory 
of  singular  functions  —  singular  values.  It  basically  amounts 
to  symmetrizing  the  kernel  t(x-y).  The  first  step  in  the 
procedure  is  to  form  the  right  and  left  kernels 

rb 

tK(x,y)  =  I  t(x-z)  t(y-z)  dz  (2) 

a 

fb 

tL(x,y)  =  J  t(z-x)  t(z-y)  dz  .  (3) 

a 
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It  can  be  shown  [2]  that  these  two  kernels  are  symmetric  and 
nonnegative  definite.  Even  though  the  original  kernel  t(x-y) 
is  a  function  of  the  difference  (x-y),  there  is  no  guarantee 
that  tR  and  tR  can  be  written  as  such. 


The  functions  that  replace  the  eigenfunctions  for  the 
symmetric  kernel  are  termed  singular  functions .  They  are 
^n(x)  and  <J>  (x),  associated  with  the  right  and  left  kernels, 
respectively,  and  satisfy  the  homogeneous  integral  equations 


a 


2 

n 


*n(x> 


rb 

tR(x,x  '  ) 
a 


n(x*)  dx' 


(4) 


°n  *n(x) 


tR(x ,x ' )  n(x* )  dx'  , 
a 


(5) 


where  are  the  eigenvalues  of  ^n(x)  and  also  of  \pn(x) .  The 
are  termed  the  singular  values,  and  can  be  ordered: 
Oi>02>o3>. . .>0.  The  are  not  the  eigenvalues  of  t(x-y), 
and  indeed  the  kernel  t(x-y)  need  not  have  eigenvalues.  The 
singular  function  sets  {^n(x)}  and  {<(>  (x)}  are  complete  and 
orthonormal  (under  fairly  weak  mathematical  restrictions  which 
are  certainly  satisfied  in  the  situation  considered);  thus, 

>b  b 

OnU)  ^(x)  dx  =  *n(x)  ^(x)  dx  =  1  .  (6) 

a  a 
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When  the  kernel  t(x-y)  is  symmetric,  then  the  singular  values 
and  singular  functions  reduce  to  eigenvalues  An  and  eigen¬ 
functions  <()  (x) 

Xn  =  an  5  'in(x)  =  ^n^  * 


The  kernel  t(x-y)  admits  the  expansion 
00 

t(x-y)  =  I  oj  (x)  i(y)  .  (8) 

n=0  n  n 


We  can  utilize  Eq.  8  to  obtain  an  expression  for  the 
Shannon  number  of  the  incoherent  diffraction  image  of  a  finite 
object.  In  analogy  with  the  investigations  in  [3-8]  with 
respect  to  coherent  imagery,  we  define  S  as 

S#  =  t (o) (b-a)  ,  (9) 


i.e.,  the  product  of  the  point  spread  function  evaluated  at 
the  paraxial  image  point  (Strehl  ratio)  and  the  object  area 
(b-a) . 


ft 

S  can  be  expressed  in  terms  of  the  singular  values  and 
singular  functions.  Set  x  =  y  in  Eq.  8  and  integrate  over 
the  object 


rb 

t (o )dx 
a 


00  rb 

l  a  4>  ( x )  ¥n(x)  dx 
n=o 


(10) 


It  follows  that 


(11) 


S*  =  I 
n= 0 


k  a 
n  n 


5 


where 


k 


n 


■b 

<f>n(x)  ^n(x)  dx 
a 


(12) 


Note  that  kfi  <  1,  with  equality  holding  only  when  t(x-y)  is 
symmetric;  in  which  case  Eq.  11  reduces  to 
00 

s*  =  l  X  (13) 

n=0  n 


an  expression  previously  derived  by  Gori  [5]. 

We  can  interpret  the  k^  as  a  measure  of  the  degree  to  which 
the  point  spread  function  is  asymmetric.  The  Shannon  number  for 
an  object  imaged  with  an  asymmetric  point  spread  function  is 
not  equal  to  the  Shannon  number  for  the  same  object  imaged  with 
a  symmetric  point  spread  function.  Although  (b-a)  is  the  same 
in  both  cases,  t(o)  for  a  symmetric  point  spread  function  is 
the  maximum  value  of  t(x),  whereas  the  maximum  value  of  an 
asymmetric  point  spread  function  is  generally  not  at  x  =  0 ,  but 
at  some  nonzero  value  of  x.  This  is  evidenced  in  the  fact  that 
kn  <  1.  Thus ,  the  singular  functions  4>n  and  <Pn  as  well  as  the 
singular  values  o enter  into  the  determination  of  S  for  an 
asymmetric  point  spread  function.  Only  when  the  point  spread 

function  is  symmetric  do  the  singular  functions  drop  out  of  the 

u 

description  of  S  . 
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SUMMARY 


The  purpose  of  this  section  is  to  summarize  the  technical 
accomplishments  of  the  contract.  In  this  it  covers  both  the 
interim  report  and  the  present  final  report. 

The  main  theme  of  the  contract  is  the  study  of  methods  of 
obtaining  the  wavefront  aberration  function  of  an  optical  system 
from  measurements  of  functions  that  are  physically  related  to  it. 
Thus,  the  main  problem  is  one  of  inversion  in  the  presence  of 
noisy  data.  Irrespective  of  the  methods  actually  employed,  these 
inversion  (or  deconvolution)  problems  are  numerically  unstable 
because  of  lack  of  sufficient  information  from  the  response 
measurements  to  infer  the  correct  solution.  Our  basic  approach 
has  been  to  augment  the  data  with  *  additional  knowledge  of  the 
nature  of  the  quantity  being  inverted  in  order  to  make  the 
deconvolved  solution  (i.e.,  unknown  aberrated  wavefront)  at  least 
physically  meaningful.  A  variety  of  techniques  (most  unknown  to 
the  optical  community)  have  been  utilized  and  also  developed  to 
carry  out  these  calculations. 

We  considered  four  aspects  of  this  problem,  they  are: 

1.  Wavefront  from  the  measured  point  spread  function 
(Sec.  1  of  interim  report). 
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2.  Wavefront  from  influence  functions  (Sec.  3  of 
interim  report ) . 

3.  Wavefront  from  interferometric  data  (Sec.  2  of 
interim  report  and  Sec.  2  of  final  report). 

4.  Beginning  development  of  theory  to  obtain  the  wave- 
front  by  numerically  stable  projection  operators 
(Sec.  1  of  final  report). 

In  Item  1,  the  wavefront  and  point  spread  function  are 
related  by  a  nonlinear  integral  equation  of  the  first  kind.  The 
inversion  problem  was  cast  as  a  nonlinear  least  squares  estima¬ 
tion  problem  for  the  values  of  the  aberrated  wavefront  at  N  points 
over  the  aperture,  given  that  M  (M  >  N)  noisy  measurements  of  the 
point  spread  function.  Filtered  singular  value  decomposition  was 
employed  in  order  to  overcome  the  ill  conditioning  due  to  noise 
in  the  measurements.  Attention  was  drawn  to  such  difficulties  as 
nonuniqueness ,  sensitivity  of  algorithm  to  initial  guess,  etc. 
Illustrative  numerical  examples  were  presented;  on  the  basis  of 
these  calculations  it  can  be  said  that  this  method  offers  great 
promise  for  actual  application.  The  main  drawback  appears  to  be 
that  it  is  slow  and  needs  a  large  computer  (for  two-dimensional 
situations ) . 
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In  Item  2,  the  wavefront  and  the  influence  function  over 
linearly  related  with  the  influence  function  determined  empiric¬ 
ally  from  an  independent  set  of  measurements.  The  method  of 
singular  value  decomposition  is  outlined  for  this  problem. 

Actually  it  is  even  more  ill  conditioned  than  the  nonlinear  prob¬ 
lem  in  Item  1.  For  the  ill-conditioned  influence  matrix,  we  must 
accept  the  fact  that  its  rank  is  poorly  determined  numerically 
and  changes  as  the  (experimentally  determined)  matrix  elements 
vary  by  very  small  amounts.  Although  this  method  has  been  used 
in  prior  work  by  one  of  the  main  contractors,  the  numerical  diffi¬ 
culties  in  handling  this  problem  were  unknown  to  them  until  the 
principal  investigator  discussed  the  sensitivity  aspects  last 
spring.  If  this  method  is  to  be  used,  more  work  must  be  done  to 
determine  error  bounds  because  the  influence  function  as  well  as 
the  input  data  are  subject  to  uncertainty  in  the  presence  of 
noise.  In  anticipation  of  this,  the  principal  investigator  has 
continued  to  study  and  to  develop  stable  numerical  algorithms 
for  this  situation  (work  done  on  his  own  time) . 

Item  3  is  a  somewhat  different  view  of  the  same  problem,  in 
that  we  utilize  interferometric  data  to  determine  the  aberrated 
wavefront.  One  problem  that  was  solved  was  the  extension  of  the 
Zernike  aberration  theory  for  constant  amplitude  circular  aper¬ 
tures  to  annular  apertures  having  a  bell  shaped  radial  taper 
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(see  Sec.  2  of  interim  report).  These  aberration  functions  are 

used  as  basis  functions  for  the  interferometric  data  reduction 

(see  Sec.  2  of  final  report).  Both  L2  and  L  norm  reductions  of 

the  data  are  discussed.  It  is  pointed  out  that  the  L  norm  is 

2 

not  robust  with  respect  to  outliers  in  the  noisy  fringe  data 
that  we  must  expect  in  hostile  environments.  The  L  norm  is 
known  to  be  robust  in  such  situations,  and  a  linear  prop-ramming 
algorithm  is  advocated. 

Item  ^  is  an  attempt  to  develop  an  algorithm  that  will  pro¬ 
vide  a  systematic  approach  to  the  ill  posed  problem  of  extrapola¬ 
tion  of  noisy  data  such  as  is  required  in  some  attempts  to  deter¬ 
mine  the  wavefront.  In  this  it  is  an  improvement  over  previously 
proposed  iterative  algorithms  of  which  the  best  known  is  the 
Gerschberg-Saxton  algorithm.  The  method  employes  projection 
operators  in  conjunction  with  iterative  Galerkin  techniques. 
Numerical  examples  discussed  in  the  text  show  great  promise. 
Because  the  current  contract  has  been  terminated,  extensions  of 
the  analysis  to  more  complicated  situations  is  held  in  abeyance. 

In  addition  to  these  four  major  items,  there  are  three  addi¬ 
tional  topics  that  grew  out  of  the  mainstream  of  the  research. 
They  are  important  in  their  own  right.  Section  3  of  the  final 
report  is  entitled:  Upper  and  lower  bounds  on  radially  sym¬ 
metric  optical  transfer  functions.  This  problem  has  been  of 


i 


1 


interest  to  the  optical  diffraction  community  for  over  30  years. 

In  contrast  to  the  rectangular  aperture  case,  the  least  upper 
bound  and  greatest  lower  bound  ar  found  to  differ.  The  pupil 
functions  which  achieve  these  bounds  at  a  given  spatial  fre¬ 
quency  are  determined  along  with  the  associated  transfer  functions. 

While  working  on  Item  1,  an  unsucessful  attempt  was  made  to 
employ  moment  methods.  However,  it  was  found  that  moment  methods 
could  be  used  to  attack  the  retrieval  problem  in  coherence  theory. 
Section  4  of  the  final  report  summarizes  the  analysis. 

The  final  item  which  is  listed  as  Appendix  to  the  final 
report  is  entitled"  The  Shannon  number  of  an  incoherent  diffrac¬ 
tion  image.  During  the  course  of  this  investigation,  the  prin¬ 
cipal  investigator  engaged  in  conversations  with  scientists  from 
several  universities  who  are  interested  in  wavefront  deconvolu¬ 
tion  for  astronomical  seeing  purposes.  It  was  thought  that  the 
Shannon  number  of  a  diffraction  image  could  be  a  useful  merit 
factor  during  deconvolution.  Unfortunately,  the  Shannon  number 
as  worked  out  by  these  scientists  only  holds  for  symmetric 
diffraction  images.  The  purpose  of  this  appendix  is  to  develop 
the  Shannon  number  for  an  asymmetric  diffraction  image  and  to 
show  that  in  this,  the  more  realistic  situation,  the  amount  of 
information  required  to  calculate  the  Shannon  number  is  the  same 
as  that  required  to  solve  the  original  problem. 
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